To be, or not to be: that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles,
And by opposing end them?

Hamlet (III, i)

Abstract

We argue that the tools of decision theory should be taken more seriously in the specification
and analysis of systems. We illustrate this by considering a simple problem involving reliable
communication, showing how considerations of utility and probability can be used to decide
when it is worth sending heartbeat messages and, if they are sent, how often they should be
sent.

Keywords: decision theory, specifications, design and analysis of distributed systems

1 Introduction

In designing and implementing systems, choices must always be made: When should we garbage
collect? Which transactions should be aborted (to remove a deadlock)? How big should the
page table be? How often should we resend a message that is not acknowledged? Currently,
these decisions seem to be made based on intuition and experience. However, studies suggest
that decisions made in this way are prone to inconsistencies and other pitfalls [EIS89]. Just as
we would like to formally verify critical programs in order to avoid bugs, we would like to apply
formal methods when making important decisions in order to avoid making suboptimal decisions.
Mathematical logic has given us the tools to verify programs, among other things. There are also
standard mathematical tools for making decisions, which come from decision theory [Res87]. We
believe that these tools need to be taken more seriously in systems design. We view this paper as
a first step towards showing how this can be done and the benefits of so doing.

Before we delve into the technical details, let us consider a motivating example. Suppose Alice 
made an appointment with Bob and the two are supposed to meet at five. Alice shows up at five 
on the dot but Bob is nowhere in sight. At 5:20, Alice is getting restless. The question is "To stay 
or not to stay?" The answer, of course, is "It depends." Clearly, if Bob is an important business 
client and they are about to close a deal, she might be willing to wait longer. On the other hand,
if Bob is an in-law she never liked, she might be happy to have an excuse to leave. At a more 
abstract level, the utility of actually having the meeting is (or, at least, should be) an important 
ingredient in Alice's calculations. But there is another important ingredient: likelihood. If Alice 
and Bob meet frequently, she may know something about how prompt he is. Does he typically 
arrive more or less on time (in which case the fact that he is twenty minutes late might indicate 
that he is unlikely to come at all) or is he someone who quite often shows up half an hour late? 
Not surprisingly, utilities and probabilities (as measures of likelihood) are the two key ingredients 
in decision theory. 

While this example may seem far removed from computer systems, it can actually be viewed as
capturing part of atomic commitment [SKS97]. To see this, suppose there is a coordinator $p_c$ and
two other processes $p_a$ and $p_b$ working on a transaction. To commit the transaction, the coordinator
must get a yes vote from both $p_a$ and $p_b$. Suppose the coordinator gets a yes from $p_a$, but hears
nothing from $p_b$. Should it continue to wait or should it abort the transaction? The types of
information we need to make this decision are precisely those considered in the Alice-Bob example
above: probabilities and utilities. While it is obvious that the amount of time Alice should wait
depends on the situation, atomic commit protocols typically have a context-independent timeout
period. If $p_c$ has not heard from all the processes by the end of the timeout period, then the
transaction is aborted. Since the importance of the transaction and the cost of waiting are
context-dependent, a single fixed timeout period will not be appropriate in every case.

Although it is not done in atomic commit protocols, there certainly is an awareness that we 
need to take utilities or costs into account elsewhere in the database literature.^1 For example,
when a deadlock is detected in a database system, some transaction(s) must be rolled back to
break the deadlock. How do we decide which ones? The textbook response [SKS97, p. 497] is
that "[we] should roll back those transactions that will incur the minimum cost. Unfortunately, 
the term minimum cost is not a precise one." Typically, costs have been quantified in this context 
by considering things like how long the transaction has been running and how much longer it is 
likely to run, how many data items it has used, and how many transactions will be involved in a 
rollback. This is precisely the type of analysis to which the tools of decision theory can be applied. 
Ultimately we are interested in when each transaction of interest will complete its task. However, 
some transactions may be more important than others. Thus, ideally, we would like to attach 
a utility to each vector of completion times. Of course, we may be uncertain about the exact 
outcome (e.g., the exact running time of a transaction). This is one place where likelihood enters 
the picture. Thus, in general, we will need both probabilities and utilities to decide which are the 
most appropriate transactions to abort. Of course, obtaining the probabilities and utilities may in 
practice be difficult. Nevertheless, we may often be able to get reasonable estimates of them (see 
Section 6 for further discussion of this issue), and use them to guide our actions.

In this paper, we illustrate how decision theory can be used and some of the subtleties that 
arise in using it. We focus on one simple problem involving reliable communication. For ease of 
exposition, we make numerous simplifying assumptions in our analysis. Despite these simplifying
assumptions, we believe our results show that decision theory can be used in the specification and 
design of systems. 

We are not the first to attempt to apply decision theory in computer science. Shenker and his
colleagues [BBS98, BS98], for example, have used ideas from decision theory to analyze various
network protocols; Microsoft has a Decision Theory and Adaptive Systems group that has successfully
used decision theory in a number of applications, including troubleshooting problems with
printers and intelligent user interfaces in Office '97. (See http://research.microsoft.com/dtas/
for further details.) Mikler et al. [MHW96] have looked at network routing from a utility-theoretic
perspective. One important difference between our paper and theirs is that they do not treat the
utility function as a given: Their aim is to find a good utility function so that the routing algorithm
would exhibit the desired behavior (of avoiding the hot spot). More generally, our focus on writing
specifications in terms of utility, and the subtleties involved with the particular application we
consider here (reliable communication), make the thrust of this paper quite different from others
in the literature.

^1 Awareness of cost is by no means limited to the database community. For example, a sampling of the
papers at a recent DISC (Distributed Computing) Conference showed that cost was mentioned in at least seven of [...]

The rest of this paper is organized as follows. We briefly review some decision-theoretic concepts
in Section 2. In Section 3 we describe the basic model and introduce the communication problem
that serves as our running example. We show that the expected cost of even a single attempt at
reliable communication is infinite if there is uncertainty about process failures. We then show in
Section 4 how we can achieve reliable communication with finite expected cost by augmenting our
system with heartbeat messages, in the spirit of Aguilera, Chen, and Toueg [ACT97]. However, the
heartbeat messages themselves come at a cost; this cost is investigated in Section 5. We offer some
conclusions in Section 6. Some proofs are relegated to the appendix.



2 A Brief Decision Theory Primer 

The aim of decision theory is to help agents make rational decisions. There are a number of 
equivalent ways of formalizing the decision process. In this paper, we assume that (a) we have a set 
$O$ of possible states of the world or outcomes, (b) the agent can assign a utility from $\mathbb{R} \cup \{\infty, -\infty\}$
(denoted $\mathbb{R}^*$) to each outcome in $O$, and (c) each action or choice $a$ of the agent can be associated
with a subset $O_a$ of $O$ and a probability measure $\Pr_a$ on $O_a$. (This is essentially equivalent to
viewing $\Pr_a$ as a probability measure on $O$ which assigns probability 0 to the outcomes in $O - O_a$.)

Roughly speaking, the utility associated with an outcome measures how happy the agent would 
be if that outcome occurred. Thus, utilities quantify the preferences of the agent. The agent prefers 
outcome $o_1$ to outcome $o_2$ iff the utility of $o_1$ is higher than that of $o_2$. The set $O_a$ of outcomes
associated with an action or choice $a$ are the outcomes that might arise if $a$ is performed or chosen;
the probability measure on $O_a$ represents how likely each outcome is if $a$ is performed. These are
highly nontrivial assumptions, particularly the last two. We discuss them (and to what extent they
are attainable in practice) in Section 6. For now, though, we just focus on their consequences.

Recall that a random variable on the set $O$ of outcomes is a function from $O$ to $\mathbb{R}^*$. Given a
random variable $X$ and a probability measure $\Pr$ on the outcomes, the expected value of $X$ with
respect to $\Pr$, denoted $E^{\Pr}(X)$, is $\sum_{v \in X(O)} v \Pr(X = v)$, where $X(O)$ is the range of $X$ and $X = v$
denotes the set $\{o \in O : X(o) = v\}$. We drop the superscript $\Pr$ if it is clear from the context.
Note that utility is just a random variable on outcomes. Thus, with each action or choice, we have
an associated expected utility, where the expectation is taken with respect to $O_a$ and $\Pr_a$. Since
utilities can be infinite, we need some conventions to handle infinities in arithmetic expressions. If
$x > 0$, we let $x \cdot \pm\infty = \pm\infty$; if $x < 0$, we let $x \cdot \pm\infty = \mp\infty$. For all $x \in \mathbb{R}$, we let $x + \pm\infty = \pm\infty$.
Finally, we let $0 \cdot \infty = 0$. We assume that $+$ and $\cdot$ remain commutative on $\mathbb{R}^*$, so this covers all
the cases but $\infty + (-\infty)$, which we take to be undefined.
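
The conventions for arithmetic on the extended reals can be made concrete with a small sketch. The function names `ext_mul`, `ext_add`, and `expected_value` are illustrative assumptions, not part of the paper; the sketch only covers random variables with finitely many values.

```python
import math


def ext_mul(x: float, y: float) -> float:
    """Multiply two extended reals, using the convention 0 * (+/-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y  # float multiplication already gives the right sign for infinities


def ext_add(x: float, y: float) -> float:
    """Add two extended reals; inf + (-inf) is left undefined."""
    if math.isinf(x) and math.isinf(y) and x != y:
        raise ValueError("inf + (-inf) is undefined")
    return x + y


def expected_value(dist: dict) -> float:
    """Expected value of X, where dist maps each value v of X to Pr(X = v)."""
    total = 0.0
    for v, p in dist.items():
        total = ext_add(total, ext_mul(v, p))
    return total
```

An outcome with infinite utility and probability 0 therefore contributes nothing to the expectation, which is the property the proofs in Section 3 rely on.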

The "rational choice" is typically taken to be the one that maximizes expected utility. While 
other notions of rationality are clearly possible, for the purposes of this paper, we focus on expected 
utility maximization. Again, see Section 6 for further discussion of this issue.

We can now apply these notions to the Alice-Bob example from the introduction. One way
of characterizing the possible outcomes is as pairs $(m_a, m_b)$, where $m_a$ is the number of minutes
that Alice is prepared to wait, and $m_b$ is the time that Bob actually arrives. (If Bob does not
arrive at all, we take $m_b = \infty$.) Thus, if $m_a \ge m_b$, then Alice and Bob meet at time $m_b$ in the
outcome $(m_a, m_b)$. If $m_a < m_b$, then Alice leaves before Bob arrives. What is the utility of the
outcome $(m_a, m_b)$? Alice and Bob may well assign different utilities to these outcomes. Since we
are interested in Alice's decision, we consider Alice's utilities. A very simple assumption is that
there is a fixed positive benefit meet-Bob to Alice if she actually meets Bob and a cost of c-wait
for each minute she waits, and that these utilities are additive. We assume here that c-wait < 0.
(In general, costs are described by non-positive utilities.) Under this assumption, the utility of the
outcome $(m_a, m_b)$ is meet-Bob $+\ m_b \cdot$ c-wait if $m_a \ge m_b$ and $m_a \cdot$ c-wait if $m_a < m_b$.

Of course, in practice, the utilities might be much more complicated and need not be additive. 
For example, if Alice has a magazine to read, waiting for the first fifteen minutes might be relatively 
painless, but after that, she might get increasingly frustrated and the cost of waiting might increase 
exponentially, not linearly. The benefit to meeting Bob may also depend on the time they meet, 
independent of Alice's frustration. For example, if they have a dinner reservation for 6 p.m. at a 
restaurant half an hour away, the utility of meeting Bob may drop drastically after 5:30. Finally, 
the utility of $(m_a, m_b)$ might depend on $m_b$ even if $m_a < m_b$. For example, Alice might feel happier
leaving at 5:15 if she knew that Bob would arrive at 6:30 than if she knew he would arrive at 5:16. 

Once Alice has decided on a utility function, she has to decide what action to take. The only 
choice that Alice has is how long to wait. With each choice $m_a$, the set of possible outcomes consists
of those of the form $(m_a, m_b)$, for all possible choices of $m_b$. Thus, to compute the expected utility
of the choice ma, she needs a probability measure over this set of outcomes, which effectively means 
a probability measure over Bob's possible arrival times. 
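
As a rough illustration of this calculation, the following sketch computes Alice's expected utility for each candidate waiting time under the simple additive utility described above. The arrival distribution and the numeric values of meet-Bob and c-wait in the usage note are made up for the example, and the function names are hypothetical.

```python
import math


def utility(m_a: float, m_b: float, meet_bob: float, c_wait: float) -> float:
    """Additive utility of outcome (m_a, m_b); c_wait is negative (a cost per minute)."""
    if m_a >= m_b:
        # Alice is still there when Bob arrives at time m_b.
        return meet_bob + m_b * c_wait
    # Alice leaves after m_a minutes without meeting Bob.
    return m_a * c_wait


def expected_utility(m_a: float, arrival_dist: dict, meet_bob: float, c_wait: float) -> float:
    """arrival_dist maps each possible arrival time m_b (math.inf = never) to its probability."""
    return sum(p * utility(m_a, m_b, meet_bob, c_wait) for m_b, p in arrival_dist.items())


def best_wait(choices, arrival_dist: dict, meet_bob: float, c_wait: float) -> float:
    """Return the waiting time in `choices` that maximizes expected utility."""
    return max(choices, key=lambda m_a: expected_utility(m_a, arrival_dist, meet_bob, c_wait))
```

For instance, with `arrival_dist = {10: 0.5, 30: 0.3, math.inf: 0.2}`, `meet_bob = 100`, and `c_wait = -1`, waiting 30 minutes is the best of the choices 0, 10, 30, and 60; if waiting costs ten times as much, not waiting at all becomes the better choice.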

This approach of deciding at the beginning how long to wait may seem far removed from actual 
practice, but suppose instead Alice sent her assistant Cindy to meet Bob. Knowing something 
about Bob's timeliness (or lack thereof), she may well want to give Cindy instructions for how long 
to wait. Taking the cost of waiting to be linear in the amount of time that Cindy waits is now 
not so unreasonable, since while Cindy is tied up waiting for Bob, she is not able to help Alice in 
other ways. If Cindy goes to meet Bob frequently for Alice, it may make more sense for Alice just 
to tell Cindy her utility function, and let Cindy decide how long to wait based on the information 
she acquires regarding Bob's punctuality. Of course, once we think in terms of Alice sending an 
assistant, it is but a small step to think of Alice running an application, and giving the application 
instructions to help it decide how to act. 

3 Reliable Communication 

We now consider a problem that will serve as a running example throughout the rest of the paper. 
Consider a system consisting of a sender p and a receiver q connected by an unreliable bidirectional 
link. We assume that the link satisfies the following properties: 

• The transmission delay of the link is $\tau$.

• The link can only fail by losing (whole) messages and the probability of a message loss is $\gamma$.

We assume that the transmission delay and the probability of message loss are independent of the
state of the system.^2 A process is correct if it never crashes. For $x \in \{p, q\}$, let $\alpha_x$ be the probability
that $x$ is correct (more precisely, the probability of the set of runs in which $x$ is correct). In runs
in which $x$ is not correct, $x$ crashes in each time unit with probability $\beta_x > 0$, independent of all
other events in the system (such as the events that occurred during the previous time unit).

^2 The results of this paper hold even if these quantities do depend on the state of the link. For example, $\gamma$ may be
a function of the number of messages in transit. We stick to the simpler model for ease of exposition.

The assumption that seems most reasonable to us is that $\alpha_p = \alpha_q = 0$: in practice, there is
always a positive probability that a process will crash in any given round.^3 We allow the possibility
that $\alpha_x \ne 0$ to facilitate comparison to most of the literature, which does not make probabilistic
assumptions about failure. It also may be a useful way of modeling the scenario in which processes 
stay up forever "for all practical purposes" (for example, if the system is scheduled to be taken 
off-line before the processes crash). 

We want to implement a reliable link on top of the unreliable link provided by the system. 
That is, we want to implement a reliable send-receive protocol SR using the (unreliable) sends and 
receives provided by the link, denoted send and receive. SR is a joint protocol, consisting of a SEND 
protocol for the sender and a RECEIVE protocol for the receiver. SR can be initiated by either p or 
q. A send-receive protocol is said to be sender-driven if it is initiated by p and receiver-driven if
it is initiated by q. (Web browsing can be viewed as an instance of a receiver-driven activity. The
web browser queries the web server for the content of the page.) We assume that sends and receives
take place at a time t, while SENDs and RECEIVEs take place over an interval of time (since, in
general, they may involve a sequence of sends and receives). 

We assume that send and receive satisfy the following two properties: 

• If $q$ receives $m$ at time $t$, then $p$ sent $m$ at time $t - \tau$ and $m$ was not lost (since the link cannot
create messages or duplicate messages and the transmission delay is known to be $\tau$).

• If $p$ sends $m$ at time $t$, then with probability $1 - \gamma$, $q$ will receive $m$ at time $t + \tau$; if $q$ does
not receive $m$ at time $t + \tau$, $q$ will never receive it.
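
The two link properties can be sketched as a tiny simulator. The class name `LossyLink` and its interface are assumptions made for illustration; `tau` stands for the transmission delay and `gamma` for the probability of message loss.

```python
import random


class LossyLink:
    """One direction of an unreliable link with fixed delay tau and loss probability gamma."""

    def __init__(self, tau: int, gamma: float, rng=None):
        self.tau = tau
        self.gamma = gamma
        self.rng = rng or random.Random()
        self.in_transit = []  # list of (delivery_time, message) pairs

    def send(self, t: int, m) -> None:
        """Send m at time t; it survives with probability 1 - gamma."""
        if self.rng.random() >= self.gamma:
            self.in_transit.append((t + self.tau, m))
        # Otherwise the message is lost for good: the link never retries or duplicates.

    def receive(self, t: int) -> list:
        """Return the messages delivered exactly at time t (each is delivered at most once)."""
        arrived = [m for (when, m) in self.in_transit if when == t]
        self.in_transit = [(when, m) for (when, m) in self.in_transit if when != t]
        return arrived
```

A message sent at time `t` is either returned by `receive(t + tau)` or never delivered, which mirrors the second property; the link never creates messages, which mirrors the first.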

What specification should SR satisfy? Clearly we do not want the processes to create messages out 
of whole cloth. Thus, we certainly want the following requirement: 

S0. If $q$ finishes RECEIVing $m$ at time $t$, then $p$ must have started SENDing $m$ at some time $t' \le t$
and $q$ must have received $m$ at some time $t'' \le t$.

We shall implicitly assume S0 without further comment throughout the paper.

The more interesting question is what liveness requirements SR should satisfy. Perhaps the 
most obvious requirement is: 

S1. If p and q are correct and SR is started with m as the message, then q eventually finishes
RECEIVing m.

Although S1 is very much in the spirit of typical specifications, which focus only on what happens
if processes are correct, we would argue that it is rather uninteresting, for two reasons (which
apply equally well to many other similar specifications). The first shows that it is too weak: If
$\alpha_p = \alpha_q = 0$, then p and q are correct (i.e., never crash) with probability 0. Thus, specification
S1 is rather uninteresting in this case: It is saying something about a set of runs with vanishingly
small likelihood. The second problem shows that S1 is too strong: In runs where p and q are
correct, there is a chance (albeit a small one) that the link may lose all messages. In this case, q
cannot finish RECEIVing m, since it cannot receive m (as all the messages are lost). Thus S1 is not
satisfied.

^3 We assume that round $k$ takes place between time $k - 1$ and $k$.

Of course, both of these problems are well known. The standard way to strengthen S1 to deal
with the first problem is to require only that p and q be correct for "sufficiently long", but then
we need to quantify this; it is far from clear how to do so. The standard way to deal with the
second problem is to restrict attention to fair runs, according to some notion of fairness [Fra86], and
require only that q finishes RECEIVing m in fair runs. Fairness is a useful abstraction for helping
us characterize conditions necessary to prove certain properties. However, what makes fairness of 
practical interest is that, under reasonable probabilistic assumptions, it holds with probability 1. 

Our interest here, as should be evident from the introduction, is to make more explicit use of 
probability in writing a specification. For example, we can write a probabilistic specification like 
the following: 

S2. $\lim_{t \to \infty} \Pr(\text{$q$ finishes RECEIVing $m$ no later than $t$ time units after the start of SR} \mid \text{$p$ and $q$
are up $t$ time units after the start of SR}) = 1$.

Requirement S2 avoids the two problems we saw with S1. It says, in a precise sense, that if p and
q are up for sufficiently long, then q will RECEIVE m with high probability (where "sufficiently 
long" is quantified probabilistically). Moreover, by making only a probabilistic statement, we do 
not have to worry about unfair runs: They occur with probability 0. 

The traditional approach has been to separate specifying the properties that a protocol must 
satisfy from the problem of finding the best algorithm that meets the specification. But that 
approach typically assumes that properties are all-or-nothing propositions. That is, it implicitly 
assumes that a desirable property must be true in every run (or perhaps every fair run) of a 
protocol. It does not allow a designer to specify that it may be acceptable for a desirable property 
to sometimes fail to hold, if that results in much better properties holding in general. We believe 
that, in general, issues of cost should not be separated from the problem of specifying the behavior 
of an algorithm. A protocol that satisfies a particular traditional specification may do so at the 
price of having rather undesirable behavior on a significant fraction of runs. For example, to ensure 
safety, a protocol may block 20% of the time. There may be an alternate protocol that is unsafe 
only 2% of the time but also blocks only 2% of the time. Whether it is better to violate safety 
2% of the time and liveness 2% of the time or to never violate safety but violate liveness 20% of 
the time obviously depends on the context. The problem with the traditional approach is that this 
comparison is never even considered (any algorithm that does not satisfy safety is automatically 
dismissed). 
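
To see how the comparison depends on context, here is a minimal sketch that assumes, purely for illustration, that a blocked run and an unsafe run each carry a fixed cost and that these costs simply add. The cost values and function names are hypothetical; the paper does not prescribe this particular model.

```python
def expected_cost(p_block: float, p_unsafe: float, c_block: float, c_unsafe: float) -> float:
    """Expected cost per run, assuming additive costs for blocking and for violating safety."""
    return p_block * c_block + p_unsafe * c_unsafe


def better_protocol(c_block: float, c_unsafe: float) -> str:
    """Compare the always-safe protocol (blocks 20% of runs) with the alternate one (2% / 2%)."""
    safe = expected_cost(0.20, 0.00, c_block, c_unsafe)
    alternate = expected_cost(0.02, 0.02, c_block, c_unsafe)
    return "alternate" if alternate < safe else "safe"
```

Under these assumptions the alternate protocol wins exactly when `0.02 * c_block + 0.02 * c_unsafe < 0.20 * c_block`, that is, when a safety violation costs less than nine times as much as a blocked run.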

While we believe S2 is a better specification of what is desired than S1, it is still not good enough
for our purposes, since it does not take costs into account. Without costs, we still cannot decide if 
it is better to violate liveness 20% of the time or to violate safety 2% of the time and liveness 2% 
of the time. As a first step to thinking in terms of costs, consider the following specification: 

S3. For each message m, the expected cost of SR(m) is finite.

As stated, S3 is not well defined, since we have not specified the cost function. We now consider a 
particularly simple cost function, much in the spirit of the Alice-Bob example discussed in Section 2.
Let SR be a send-receive protocol. Its outcomes are just the possible runs or executions. We want 
to associate with each run its utility. There are two types of costs we will take into account: sending 
messages and waiting. The intuition is that each attempt to send a message consumes some system 
resources and each time unit spent waiting costs the user. The total cost is a weighted sum of the 
two. 

More precisely, let c-send and c-wait be constants representing the cost of sending a message 
and of waiting one time unit, respectively. Given a run $r$, let #-send($r$) be the number (possibly
$\infty$) of sends done by the protocol in run $r$. We now want to define t-wait($r$), which intuitively
is the amount of time q spends waiting to RECEIVE m. When should we start counting? In the 
Alice-Bob example, it was clear, since Alice starts waiting for Bob at 5:00. We do not want to 
start counting at a fixed time, since we do not assume that the processes will start their protocol 
at a particular time. What we want is to start at the time when SR is invoked. When do we 
stop counting, assuming we started? If there are no process crashes, then we stop counting when q 
finishes RECEIVing m. What if there are process crashes? In traditional specifications (such as S1),
the protocol has no obligations once a process fails. To facilitate comparison between our approach 
and the traditional approach, we stop counting at the time of a process crash if it happens before 
q finishes RECEIVing m. (Note that q may never finish RECEIVing if a process crashes.) 

Let $t_s$ be the time SR is invoked. (If no such time exists, we let $\text{t-wait}(r) = 0$.) Let $t_p$ be
the time $p$ crashes ($t_p = \infty$ if $p$ does not crash); let $t_q$ be the time $q$ crashes ($t_q = \infty$ if $q$ does
not crash); let $t_f$ be the time $q$ finishes RECEIVing $m$ ($t_f = \infty$ if $q$ does not finish). Finally, let
$\text{t-wait}(r) = \max\{\min\{t_p, t_q, t_f\}, t_s\} - t_s$. We take the (total) cost of run $r$ to be

\[ c_0(r) = \text{\#-send}(r) \cdot \text{c-send} + \text{t-wait}(r) \cdot \text{c-wait}. \]

Note that $c_0$ is a random variable on runs. If $c_0(r)$ captures the cost of run $r$ (as we are assuming
here it does), then S3 says that we want $E(c_0) = E(\text{\#-send}) \cdot \text{c-send} + E(\text{t-wait}) \cdot \text{c-wait}$ to be finite.

Note that, if SR is not invoked in a run $r$, then $c_0(r) = 0$. Since we are interested in the
expected cost of SR, we consider only runs in which SR is actually invoked. Also, since we are
interested in the expected cost of a single invocation in this (and the next) section, we assume for
ease of exposition that the protocol is invoked at time 0 (so $\text{t-wait}(r) = \min\{t_p, t_q, t_f\}$) throughout
these two sections without further comment.

Proposition 3.1: S2 and S3 are incomparable under cost function $c_0$.

Proof: Suppose $\alpha_p = \alpha_q = 1$. Consider a send-receive protocol $SR_0$ in which $p$ sends $m$ in every
round until it receives ACK($m$), and $q$ sends its $k$th ACK($m$) $N^k$ rounds after receiving $m$ for the
$k$th time, where $N\gamma > 1$. (Recall that $\gamma$ is the probability of message loss.) It is easy to see that
$SR_0$ satisfies S2. We show that it does not satisfy S3 by showing that $E(\text{\#-send}) = \infty$.

The basic idea is that $q$ is not acknowledging the receipt of $m$ in a timely fashion, so $p$ will send
too many copies of $m$. Let $A_k = \{r : \text{$q$'s first $k$ ACKs are lost and the $(k+1)$st ACK makes it in $r$}\}$;
let $A_\infty = \{r : \text{all of $q$'s ACKs are lost}\}$. Note that $\Pr(A_k) = \gamma^k(1 - \gamma)$ and $\Pr(A_\infty) = 0$ (so we can
ignore runs in $A_\infty$ for the purpose of computing expected cost, since we adopted the convention
that $0 \cdot \infty = 0$). Note also that $E(\text{\#-send} \mid A_k) \ge N^k$, since $p$ cannot possibly get its first ACK($m$)
before time $N^k$ in runs in $A_k$. Thus

\[ E(\text{\#-send}) = \sum_{k=0}^{\infty} E(\text{\#-send} \mid A_k) \Pr(A_k) \ge \sum_{k=0}^{\infty} N^k \gamma^k (1 - \gamma). \]

It is clear that the last sum is not finite, since $N\gamma > 1$; thus the algorithm fails to satisfy S3.

Suppose $\alpha_p = \alpha_q = 0$. Consider the trivial protocol (i.e., the "do nothing" protocol). In a
round in which both $p$ and $q$ are up, one of $p$ or $q$ will crash in the next round with probability
$\beta = \beta_p + \beta_q - \beta_p\beta_q$. So the probability that the first crash happens at time $k$ is $(1 - \beta)^k \beta$. Thus
one of them is expected to crash at time

\[ \sum_{k=0}^{\infty} k (1 - \beta)^k \beta = \frac{1 - \beta}{\beta}. \]

(Here and elsewhere in this paper we use the well-known fact that $\sum_{k=0}^{\infty} k x^k = \frac{x}{(1 - x)^2}$.) Thus,
$E(c_0) = \frac{1 - \beta}{\beta} \cdot \text{c-wait}$ for the trivial protocol, so the trivial protocol satisfies S3, although it clearly
does not satisfy S2. ∎
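
The quantities used in this section can be checked numerically with a short sketch. All function names are illustrative assumptions; the sketch handles only finite arguments in `c0`, and the truncated sum in `expected_first_crash` is only accurate when the crash probability is not too close to 0.

```python
import math


def t_wait(t_s, t_p: float, t_q: float, t_f: float) -> float:
    """Waiting time of a run: from invocation until the first crash or the end of RECEIVE."""
    if t_s is None:  # SR is never invoked in this run
        return 0
    return max(min(t_p, t_q, t_f), t_s) - t_s


def c0(n_send: int, wait: float, c_send: float, c_wait: float) -> float:
    """Total cost of a run: a weighted sum of the number of sends and the waiting time."""
    return n_send * c_send + wait * c_wait


def crash_prob(beta_p: float, beta_q: float) -> float:
    """Probability that at least one of p, q crashes in a given round."""
    return beta_p + beta_q - beta_p * beta_q


def expected_first_crash(beta: float, terms: int = 10_000) -> float:
    """Truncated version of sum_k k (1 - beta)^k beta, which converges to (1 - beta) / beta."""
    return sum(k * (1 - beta) ** k * beta for k in range(terms))


def send_lower_bound(n: float, gamma: float, k_max: int) -> float:
    """Partial sum of sum_k N^k gamma^k (1 - gamma); it grows without bound when N * gamma > 1."""
    return sum((n * gamma) ** k * (1 - gamma) for k in range(k_max + 1))
```

With `beta = crash_prob(0.1, 0.1)`, the truncated sum agrees with `(1 - beta) / beta` to many digits, while `send_lower_bound(20, 0.1, k_max)` keeps roughly doubling with each additional term, matching the divergence argument for the slow-acknowledgment protocol.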

The following theorem characterizes when S3 is implementable with respect to the cost function
$c_0$. Moreover, it shows that with this cost function, when S3 is satisfiable, there are in fact protocols
that satisfy S3 and S2 simultaneously.

Theorem 3.2: Under cost function $c_0$, there is a send-receive protocol satisfying S3 iff $\alpha_p = 0$ or
$\alpha_q = 0$ or $\alpha_q = 1$ or $\alpha_p = 1$. Moreover, if $\alpha_p = 0$ or $\alpha_q = 0$ or $\alpha_q = 1$ or $\alpha_p = 1$, then there is a
send-receive protocol that satisfies both S2 and S3.

Proof: Suppose $\alpha_q = 1$ or $\alpha_p = 0$. Consider the (sender-driven) protocol $SR_1$ in which $p$ sends $m$ to
$q$ until $p$ receives an ACK($m$) from $q$, and $q$ sends ACK($m$) whenever it receives $m$. $SR_1$ starts when
$p$ first sends $m$ and $q$ finishes RECEIVing $m$ when it first receives $m$. To see that $SR_1$ is correct,
first consider the case that $\alpha_q = 1$. Let $C_p = \{r : \text{$p$ receives ACK($m$) at least once from $q$ in $r$}\}$. Let
$N_1(r) = k_1$ if the $k_1$th copy of $m$ is the first received by $q$ and let $N_2(r) = k_2$ if the $k_2$th copy of $m$
is the one whose corresponding ACK($m$) is the first received by $p$.

Since the probability that the link may drop a particular message is $\gamma$,

\[ E(N_1 \mid C_p) = \sum_{k=1}^{\infty} k \gamma^{k-1} (1 - \gamma) = \frac{1 - \gamma}{\gamma} \sum_{k=1}^{\infty} k \gamma^k = \frac{1}{1 - \gamma}. \]

An analogous argument shows that $E(N_2 \mid C_p) = \frac{1}{(1 - \gamma)^2}$. Note that $\text{t-wait}(r) = N_1(r) + \tau - 1$ for
$r \in C_p$, so $E(\text{t-wait} \mid C_p) = E(N_1 \mid C_p) + (\tau - 1) = \frac{1}{1 - \gamma} + \tau - 1$. Moreover, since $p$ stops sending $m$
when it receives ACK($m$) from $q$, it will stop $2\tau$ rounds after the $N_2(r)$th send of $m$ in run $r$. Thus
$\frac{1}{(1 - \gamma)^2} + 2\tau - 1$ is the number of times $p$ is expected to send $m$ in runs of $C_p$. We expect $1 - \gamma$ of these to
be successful, so the number of times $q$ is expected to send ACK($m$) is at most $\frac{1}{1 - \gamma} + (2\tau - 1)(1 - \gamma)$.
(The actual expected value is slightly less since $q$ may crash shortly after sending the first ACK($m$)
received by $p$ in runs of $C_p$.) We conclude that $E(\text{\#-send} \mid C_p) \le \frac{1}{1 - \gamma} + \frac{1}{(1 - \gamma)^2} + (2\tau - 1)(2 - \gamma)$.
Thus $E(c_0 \mid C_p)$ is finite, since both $E(\text{\#-send} \mid C_p)$ and $E(\text{t-wait} \mid C_p)$ are finite.
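
The closed forms in this step can be checked numerically. The sketch below treats $N_1$ as geometric with success probability $1 - \gamma$ and, as one reading of the "analogous argument", treats $N_2$ as geometric with round-trip success probability $(1 - \gamma)^2$; it ignores crashes and the conditioning on $C_p$. The function names are hypothetical.

```python
def expected_n1(gamma: float, terms: int = 10_000) -> float:
    """Truncated sum_{k>=1} k gamma^(k-1) (1 - gamma); converges to 1 / (1 - gamma)."""
    return sum(k * gamma ** (k - 1) * (1 - gamma) for k in range(1, terms + 1))


def expected_n2(gamma: float, terms: int = 10_000) -> float:
    """Same sum with round-trip success probability s = (1 - gamma)^2; converges to 1 / s."""
    s = (1 - gamma) ** 2  # assumption: message and ACK are lost independently
    return sum(k * (1 - s) ** (k - 1) * s for k in range(1, terms + 1))


def send_bound(gamma: float, tau: int) -> float:
    """Upper bound on the expected number of sends by p and q combined in runs of C_p."""
    return 1 / (1 - gamma) + 1 / (1 - gamma) ** 2 + (2 * tau - 1) * (2 - gamma)
```

For `gamma = 0.2` the two sums come out at about 1.25 and 1.5625, and with `tau = 3` the bound on the expected number of sends is about 11.81, so both ingredients of the conditional expected cost are small finite numbers.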
We now turn to $E(c_0 \mid \overline{C_p})$. We first partition $\overline{C_p}$ into two sets:

• $F_1 = \{r : \text{$p$ crashes before receiving an ACK($m$) from $q$}\}$ and

• $F_2 = \{r : \text{$p$ does not crash and does not receive ACK($m$) from $q$}\}$.

Note that $\Pr(F_2) = 0$ and $\Pr(F_1) = 1 - \Pr(C_p)$. We may ignore runs of $F_2$ for the purposes of
computing the expected cost since we adopted the convention that $0 \cdot \infty = 0$. In runs $r$ of $F_1$,
$\text{t-wait}(r)$ is at most the time it takes for $p$ to crash, which is expected to occur at time $\frac{1 - \beta_p}{\beta_p}$. Thus
$E(\text{t-wait} \mid F_1) \le \frac{1 - \beta_p}{\beta_p}$. Furthermore, if $p$ crashes at time $t_c$ in $r \in F_1$, it sends $m$ exactly $t_c$ times in
$r$ (since $p$ does not receive ACK($m$) in runs of $F_1$). In that case, $q$ sends ACK($m$) at most $t_c$ times.
So $\text{\#-send}(r) \le 2t_c$ if $p$ crashes at time $t_c$ in $r \in F_1$. Thus $E(\text{\#-send} \mid F_1) \le \frac{2(1 - \beta_p)}{\beta_p}$. It follows that
$E(c_0 \mid \overline{C_p})$ is finite. Since both $E(c_0 \mid C_p)$ and $E(c_0 \mid \overline{C_p})$ are finite, $E(c_0)$ is finite; so $SR_1$ satisfies
S3. To see that the protocol satisfies S2, note that for $t \ge \tau$, the probability that $q$ does not finish
RECEIVing $m$ by time $t$ given that both $p$ and $q$ are still up is $\gamma^{t - \tau}$. Thus S2 is also satisfied.



Now consider the case that $\alpha_p = 0$. Note that in this case, $p$ is expected to crash at time
$\frac{1 - \beta_p}{\beta_p}$. Thus, $E(\text{t-wait}) \le \frac{1 - \beta_p}{\beta_p}$ and $E(\text{\#-send}) \le \frac{2(1 - \beta_p)}{\beta_p}$ (for the same reason as above), regardless of whether $q$
is correct. Thus $E(c_0)$ is again finite. The argument that S2 is satisfied is the same as before.

Now suppose $\alpha_p = 1$ or $\alpha_q = 0$. These cases are somewhat analogous to the ones above, except
we need a receiver-driven protocol. Consider a protocol $SR_2$ in which $q$ queries $p$ in every round
until it gets a message from $p$. More precisely, let REQ denote a request message. $q$ sends REQ to
$p$ every time unit until it receives $m$ and $p$ sends $m$ every time it receives REQ. $SR_2$ starts when $q$
sends the first REQ and $q$ finishes RECEIVing $m$ when $q$ receives $m$ for the first time. By reasoning
similar to the previous cases, we can show that $E(\text{\#-send})$ and $E(\text{t-wait})$ are both finite (so S3 is
satisfied) and that S2 is satisfied.

We now turn to the negative result. It turns out that the negative result is much more general 
than the positive result. In particular, it holds for any cost function with a certain property. In 
the following, we use $g \Rightarrow f$ to denote that if $g(x) = \infty$ then $f(x) = \infty$.

Lemma 3.3: Let $c(r)$ be a cost function such that $\text{t-wait}(r) \Rightarrow c(r)$ and $\text{\#-send}(r) \Rightarrow c(r)$. If
$0 < \alpha_p < 1$ and $0 < \alpha_q < 1$, then for any send-receive protocol SR, $\Pr(\{r : c(r) = \infty\}) > 0$.

Proof: Suppose SR is a send-receive protocol for p and q. Let $R_1 = \{r : q$ crashes at time 0
and p is correct in $r\}$. Note that p will do the same thing in all runs in $R_1$: either p stops
sending after some time $t$ or p never stops sending. If p never stops, then $\#\text{-send}(r) = \infty$ for all
$r \in R_1$. Since, by assumption, $\#\text{-send}(r) \Rightarrow c(r)$, we have that $c(r) = \infty$ for each $r \in R_1$. Since
$\Pr(R_1) = \alpha_p(1 - \alpha_q)\beta_q > 0$, we are done. Now suppose p stops sending after time $t$. Let $R_2 = \{r : p$
crashes at time 0 and q is correct in $r\}$. Note that q will do the same thing in all runs of $R_2$: either
q stops sending after some time $t'$ or q never stops sending. If q never stops, then $c(r) = \infty$ for all
$r \in R_2$ and $\Pr(R_2) = \alpha_q(1 - \alpha_p)\beta_p > 0$, so again we are done. Finally, suppose that q stops sending
at time $t'$ in runs of $R_2$. Let $t'' = 1 + \max\{t, t'\}$. Consider $R_3 = \{r :$ both processes are correct
and all messages up to time $t''$ are lost in $r\}$. Then $t\text{-wait}(r) = \infty$ for all $r \in R_3$. By assumption,
$t\text{-wait}(r) \Rightarrow c(r)$, so $c(r) = \infty$ for all $r \in R_3$. Let $n_p$ and $n_q$ be the number of invocations of send
by p and q, respectively, in runs of $R_3$ (note that p and q do the same thing in all runs of $R_3$).
Then $\Pr(R_3) = \alpha_p\alpha_q\gamma^{n_p + n_q} > 0$, completing the proof. ∎ (Lemma 3.3)



Clearly $\#\text{-send}(r) \Rightarrow c_0(r)$ and $t\text{-wait}(r) \Rightarrow c_0(r)$, so Lemma 3.3 applies immediately and we
are done. ∎ (Theorem 3.2)

Of course, once we think in terms of utility-based specifications like S3, we do not want to
know just whether a protocol implements S3; we are in a position to compare the performance
of different protocols that implement S3 (or of variants of one protocol that all implement S3) by
considering their expected utility. Let $SR_s^\delta$ and $SR_r^\delta$ be generalizations (in the sense that they send
messages every $\delta$ rounds, where $\delta$ need not be 1) of the sender-driven and receiver-driven protocols
from Theorem 3.2, respectively. Let $SR_{tr}$ denote the trivial (i.e., "do nothing") protocol. We use
$E^{SR}$ to denote the expectation operator determined by the probability measure on runs induced
by using protocol SR. Thus, for example, $E^{SR_s^\delta}(\#\text{-send})$ is the expected number of messages sent
by $SR_s^\delta$. If $\alpha_p = \alpha_q = 0$, then $SR_s^\delta$, $SR_r^\delta$, and $SR_{tr}$ all satisfy S3 (although $SR_{tr}$ does not satisfy S2).
Which is better?

In practice, process failures and link failures are very unlikely events. We assume in the rest of
the paper that $\beta_p$, $\beta_q$, and $\gamma$ are all very small, so that we can ignore sums of products of these
terms (with coefficients like $2\tau^2$, $\delta$, etc.). One way to formalize this is to say that products involving
$\beta_p$, $\beta_q$, and $\gamma$ are $O(\epsilon)$ terms and $2\tau^2$, $\delta$, etc., are $O(1)$ terms. We write $t_1 \approx t_2$ if $|t_1 - t_2|$ is $O(\epsilon)$.
Note that we do not assume expressions like $\beta_p/\beta_q$ and $\beta_q/\beta_p$ are small.

For the following result only, we assume that not only are $\beta_p$ and $\beta_q$ $O(\epsilon)$, they are also $\Theta(\epsilon)$,
so that if $1/\beta_p$ or $1/\beta_q$ is multiplied by an expression that is $O(\epsilon^2)$, then the result is $O(\epsilon)$, which can
then be ignored. (Recall that $x$ is $\Theta(\epsilon)$ iff $x$ is $O(\epsilon)$ and $x^{-1}$ is $O(\epsilon^{-1})$.)

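Proposition 3.4: If $\alpha_p = \alpha_q = 0$, then

\[ E^{SR_{tr}}(t\text{-wait}) = \frac{1 - (\beta_p + \beta_q - \beta_p\beta_q)}{\beta_p + \beta_q - \beta_p\beta_q}, \qquad E^{SR_{tr}}(\#\text{-send}) = 0, \]

\[ E^{SR_s^\delta}(t\text{-wait}) \approx \tau, \qquad E^{SR_s^\delta}(\#\text{-send}) \approx \frac{(\tau+1)\beta_q}{\delta\beta_p} + 2\left\lceil \frac{2\tau}{\delta} \right\rceil, \]

\[ E^{SR_r^\delta}(t\text{-wait}) \approx 2\tau, \qquad E^{SR_r^\delta}(\#\text{-send}) \approx \frac{(\tau+1)\beta_p}{\delta\beta_q} + 2\left\lceil \frac{2\tau}{\delta} \right\rceil. \]

Proof: The relatively straightforward (but tedious!) calculations are relegated to the appendix. ∎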
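The three estimates can be compared numerically. The following is a minimal sketch rather than part of the original analysis: it assumes that $c_0$ is linear, $c_0(r) = \text{c-wait} \cdot t\text{-wait}(r) + \text{c-send} \cdot \#\text{-send}(r)$, and that the estimates of Proposition 3.4 read $(\tau+1)\beta_q/(\delta\beta_p) + 2\lceil 2\tau/\delta \rceil$ messages for the sender-driven protocol (with $\beta_p$ and $\beta_q$ swapped for the receiver-driven one). The function name and the dictionary keys are hypothetical.

```python
import math


def expected_costs(tau, delta, beta_p, beta_q, c_send, c_wait):
    """Approximate E(c_0) for SR_tr, SR_s^delta and SR_r^delta when alpha_p = alpha_q = 0.

    Assumes c_0 = c_wait * t-wait + c_send * #-send (an assumption of this sketch).
    """
    # Probability that at least one of the two processes crashes in a time unit.
    crash = beta_p + beta_q - beta_p * beta_q
    # Copies of m plus their acknowledgments in an exchange without failures.
    exchange = 2 * math.ceil(2 * tau / delta)
    estimates = {
        "SR_tr": ((1 - crash) / crash, 0.0),
        "SR_s": (tau, (tau + 1) * beta_q / (delta * beta_p) + exchange),
        "SR_r": (2 * tau, (tau + 1) * beta_p / (delta * beta_q) + exchange),
    }
    return {
        name: c_wait * t_wait + c_send * n_send
        for name, (t_wait, n_send) in estimates.items()
    }
```

For instance, `expected_costs(tau=5, delta=2, beta_p=0.001, beta_q=0.001, c_send=1.0, c_wait=1.0)` ranks the sender-driven protocol below the receiver-driven one, and both far below the trivial protocol, whose cost is dominated by waiting for a crash.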

Note that the expected cost of messages for $SR_s^\delta$ is the same as that for $SR_r^\delta$, except that the
roles of $\beta_p$ and $\beta_q$ are reversed. The expected time cost of $SR_r^\delta$ is roughly $\tau$ higher than that of $SR_s^\delta$,
because q cannot finish RECEIVing m before time $2\tau$ with a receiver-driven protocol, whereas q may
finish RECEIVing m as early as $\tau$ with a sender-driven protocol. This says that the choice between
the sender-driven and receiver-driven protocol should be based largely on the relative probability of
failure of p and q. It also suggests that we should take $\delta$ very large to minimize costs. (Intuitively,
the larger $\delta$ is, the lower the message costs in the case that q crashes before acknowledging p's
message.) This conclusion (which may not seem so reasonable) is essentially due to the fact that
we are examining a single invocation of SR in isolation. As we shall see in Section 5, this conclusion
is no longer justified once we consider repeated invocations of SR. Finally, note that if the cost of
messages is high and waiting is cheap, the processes are better off (according to this cost function)
using $SR_{tr}$.

Thus, as far as S3 is concerned, there are times when $SR_{tr}$ is better than $SR_s^\delta$ or $SR_r^\delta$. How much
of a problem is it that $SR_{tr}$ does not satisfy S2? Our claim is that if this desideratum (i.e., S2) is
important, then it should be reflected in the cost function. While the cost function in our example
does take into account waiting time, it does not penalize it sufficiently to give us S2. It is not too
hard to find a cost function that captures S2. For example, suppose we take $c_1(r) = N^{t\text{-wait}(r)}$,
where $N(1 - \beta_p - \beta_q + \beta_p\beta_q) > 1$.

Proposition 3.5: Under cost function $c_1$, S3 implies S2.

Proof: Suppose SR is a protocol that does not satisfy S2; we show it does not satisfy S3 (under
cost function $c_1$). Let $C_p(t)$ and $C_q(t)$ consist of those runs of SR where p and q, respectively, are
up for $t$ time units after the start of SR (and perhaps longer). Let $R_q(t)$ consist of the runs of
SR where q finishes RECEIVing m no later than $t$ time units after the start of SR. Since SR does
not satisfy S2, there exists $\epsilon > 0$ and an increasing infinite sequence of times $t_0, t_1, \ldots$, such that
$\Pr(\overline{R_q(t_i)} \mid C_p(t_i) \cap C_q(t_i)) > \epsilon$ for all $i$. We consider the cases $\alpha_p = \alpha_q = 1$ and $\alpha_p\alpha_q < 1$ separately.
Suppose $\alpha_p = \alpha_q = 1$. Then $\Pr(C_p(t) \cap C_q(t)) = 1$ for all $t$. So

\[ \Pr(t\text{-wait} > t_i) = \Pr(\overline{R_q(t_i)}) = \Pr(\overline{R_q(t_i)} \mid C_p(t_i) \cap C_q(t_i)) > \epsilon \]

for all $i$. Let $V_i = \{r : t\text{-wait}(r) > t_i\}$ and $V_\infty = \{r : t\text{-wait}(r) = \infty\}$. Note that $V_\infty = \bigcap_{i=0}^{\infty} V_i$ and
that $V_i \supseteq V_{i'}$ for $i' > i$. Thus $\Pr(V_\infty) = \Pr(\bigcap_{i=0}^{\infty} V_i) \ge \epsilon$. So $E(c_1) \ge \Pr(V_\infty) N^\infty = \infty$.

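Now we turn to the case that $\alpha_p\alpha_q < 1$. Let $W(t) = \{r : t\text{-wait}(r) = t\}$. Note that $t\text{-wait}(r) =
t_i + 1$ for all runs $r \in \overline{R_q(t_i)} \cap \overline{C_p(t_i + 1)} \cap C_p(t_i) \cap C_q(t_i)$. Thus,

\[ \Pr(W(t_i + 1) \mid C_p(t_i) \cap C_q(t_i)) \ge \Pr(\overline{C_p(t_i + 1)} \cap \overline{R_q(t_i)} \mid C_p(t_i) \cap C_q(t_i)). \]

Given our independence assumptions regarding process failures,

\[ \Pr(\overline{C_p(t_i + 1)} \cap \overline{R_q(t_i)} \mid C_p(t_i) \cap C_q(t_i)) = \Pr(\overline{C_p(t_i + 1)} \mid C_p(t_i)) \, \Pr(\overline{R_q(t_i)} \mid C_p(t_i) \cap C_q(t_i)) \ge (1 - \alpha_p)\beta_p\epsilon. \]

A similar argument (exchanging the roles of $C_p$ and $C_q$) shows that

\[ \Pr(W(t_i + 1) \mid C_p(t_i) \cap C_q(t_i)) \ge (1 - \alpha_q)\beta_q\epsilon. \]

So

\begin{align*}
E(c_1) &\ge \sum_{k=0}^{\infty} \Pr(W(k)) N^k \\
&\ge \sum_{i=0}^{\infty} \Pr(W(t_i + 1)) N^{t_i + 1} \\
&\ge \sum_{i=0}^{\infty} \Pr(W(t_i + 1) \cap C_p(t_i) \cap C_q(t_i)) N^{t_i + 1} \\
&= \sum_{i=0}^{\infty} \Pr(W(t_i + 1) \mid C_p(t_i) \cap C_q(t_i)) \Pr(C_p(t_i) \cap C_q(t_i)) N^{t_i + 1} \\
&\ge \sum_{i=0}^{\infty} \max\{(1 - \alpha_p)\beta_p, (1 - \alpha_q)\beta_q\} \, \epsilon \, (1 - \beta_p - \beta_q + \beta_p\beta_q)^{t_i} N^{t_i + 1}.
\end{align*}

Since $(1 - \beta_p - \beta_q + \beta_p\beta_q)N > 1$ by assumption, we are done. ∎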

The moral here is that S3 gives us the flexibility to specify what really matters in a protocol, 
by appropriately describing the cost function. We would like to remind the reader that the cost 
functions are not ours to choose: They reflect the user's preferences. (Thus we are not saying that 
$c_1$ is better than $c_0$ or vice versa, since each user is entitled to her own preferences.) What we are
really saying here is that if S2 matters to the user, then her cost function would force S3 to imply
S2; in particular, her cost function could not be $c_0$.

4 Using Heartbeats 

We saw in Section 3 that S3 is not implementable if we are not certain about the correctness of
the processes (i.e., if the probability that they are correct is strictly between 0 and 1) and the
cost function $c(r)$ has the property that $\#\text{-send}(r) \Rightarrow c(r)$ and $t\text{-wait}(r) \Rightarrow c(r)$. Aguilera,
Chen, and Toueg [ACT97] (ACT from now on) suggest an approach that circumvents this problem,
using heartbeat messages. Informally, a heartbeat from process i is a message sent by i to all other
processes to tell them that it is still alive. ACT show that there is a protocol using heartbeats that
achieves quiescent reliable communication; i.e., in every run of the protocol, only finitely many
messages are required to achieve reliable communication (not counting the heartbeats). Moreover,
they show that, in a precise sense, quiescent reliable communication is not possible if we are not
certain about the correctness of the processes and communication is unreliable, a result much in the
spirit of the negative part of Theorem 3.2. In this section, we show that (using the cost function
$c_0$) we can use heartbeats to implement S3 for all values of $\alpha_p$ and $\alpha_q$.

For the purposes of this paper, assume that processes send a message we call HBMSG to each
other every $\delta$ time units. Protocol $SR_{hb}$ in Figure 1 is a protocol for reliable communication based
on ACT's protocol. (It is not as general as theirs, but it retains all the features relevant to us.)
Briefly, what happens according to this protocol is that the failure detector layer of q sends HBMSG
to the corresponding layer of p periodically. If p wants to SEND m, p checks to see if any (new)



ACT actually show that their impossibility result holds even if there is only one process failure, only finitely many
messages can be lost, and the processes have access to S (a strong failure detector), which means that eventually
every faulty process is permanently suspected and at least one correct process is never suspected. The model used by
ACT is somewhat different from the one we are considering, but we can easily modify their results to fit our model.

The sender's protocol (SEND):

1. while ¬receive(ACK(m)) do
2.     if receive(HBMSG) then
3.         send(m)
4.     fi
5. od

The receiver's protocol (RECEIVE):

1. while true do
2.     if receive(m) then
3.         send(ACK(m))
4.     fi
5. od

Figure 1: Protocol $SR_{hb}$



HBMSG has arrived; if so, p sends m to q, provided it has not already received ACK(m) from q; q
sends ACK(m) every time it receives m, and q finishes RECEIVing m the first time it receives m. Note
that q does not send any HBMSGs as part of $SR_{hb}$. That is the job of the failure-detection layer,
not the job of the protocol. (We assume that the protocol is built on top of a failure-detection
service.) The cost function of the previous section does not count the costs of HBMSGs. That is,
since $\#\text{-send}(r)$ is the number of messages sent by the protocol, $c_0(r)$ is not affected by the number
of HBMSGs sent in run r. It is also worth noting that this is a sender-driven protocol, quite like that
given in the proof of Theorem 3.2. It is straightforward to also design a receiver-driven protocol
using heartbeats.
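The behaviour of the protocol can be explored with a small discrete-time simulation. The sketch below is illustrative only and rests on several modelling assumptions that are ours: a single invocation starts at time 0, every message takes exactly `tau >= 1` time units and is lost independently with probability `gamma`, a running process crashes in each time unit with probability `beta_p` or `beta_q`, and q's heartbeat layer sends HBMSG at times 0, `delta`, 2·`delta`, and so on. The function name `simulate_sr_hb` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def simulate_sr_hb(tau, delta, gamma, beta_p, beta_q, horizon, seed=0):
    """Simulate one invocation of SR_hb that starts at time 0.

    Returns (n_send, t_wait). n_send counts copies of m and ACK(m) only;
    heartbeats are not counted. t_wait is the first time at which q receives
    m or a process crashes, or None if neither happens before `horizon`.
    """
    rng = random.Random(seed)
    in_transit = defaultdict(list)  # arrival time -> [(destination, kind)]
    p_up = q_up = True
    acked = False
    n_send = 0
    t_wait = None

    def post(now, dest, kind):
        # The link loses each message independently with probability gamma.
        if rng.random() >= gamma:
            in_transit[now + tau].append((dest, kind))

    for now in range(horizon):
        # Each running process crashes in this time unit with probability beta.
        if p_up and rng.random() < beta_p:
            p_up = False
        if q_up and rng.random() < beta_q:
            q_up = False
        if t_wait is None and not (p_up and q_up):
            t_wait = now

        # q's failure-detection layer emits a heartbeat every delta time units.
        if q_up and now % delta == 0:
            post(now, "p", "HBMSG")

        got_heartbeat = False
        for dest, kind in in_transit.pop(now, []):
            if dest == "q" and kind == "m" and q_up:
                if t_wait is None:
                    t_wait = now       # q finishes RECEIVing m
                post(now, "p", "ACK")  # q acknowledges every copy of m
                n_send += 1
            elif dest == "p" and kind == "ACK" and p_up:
                acked = True
            elif dest == "p" and kind == "HBMSG" and p_up:
                got_heartbeat = True

        # Sender loop of Figure 1: send m on each new heartbeat until acknowledged.
        if p_up and got_heartbeat and not acked:
            post(now, "q", "m")
            n_send += 1

    return n_send, t_wait
```

With no crashes and no message loss, `simulate_sr_hb(3, 2, 0.0, 0.0, 0.0, 30)` sends three copies of m and three acknowledgments, and q first receives m at time 6, that is, after two message delays.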

We now want to show that $SR_{hb}$ implements S3 and get a good estimate of the actual expected
cost.

Theorem 4.1: Under cost function $c_0$, Protocol $SR_{hb}$ satisfies S3. Moreover, $E(t\text{-wait}) \approx 2\tau$ and
$E(\#\text{-send}) \approx 2\lceil 2\tau/\delta \rceil$, so that $E(c_0) \approx 2\tau \, \text{c-wait} + 2\lceil 2\tau/\delta \rceil \, \text{c-send}$.

Proof: Using arguments similar to those of the proof of Proposition 3.4, we can show that
$E(t\text{-wait}) \approx 2\tau$ and $E(\#\text{-send}) \approx 2\lceil 2\tau/\delta \rceil$. We leave details to the reader. ∎



The analysis of $SR_{hb}$ is much like that of $SR_s^\delta$ in Proposition 3.4. Indeed, in the case that
$\alpha_p = \alpha_q = 0$, the two protocols are almost identical. The waiting time is roughly $\tau$ more for $SR_{hb}$,
since p does not start sending until it receives the first HBMSG from q. On the other hand, we
are better off using $SR_{hb}$ if q crashes before acknowledging p's message. In this case, with $SR_s^\delta$,
p continues to send until it crashes, while with $SR_{hb}$, it stops sending (since it does not get any
HBMSGs from q). This leads to an obvious question: Is it really worth sending heartbeats? Of
course, if both $\alpha_p$ and $\alpha_q$ are between 0 and 1, we need heartbeats or something like them to get
around the impossibility result of Theorem 3.2. But if $\alpha_p = \alpha_q = 0$, then we need to look carefully
at the relative size of c-send and c-wait to decide which protocol has the lower expected cost.
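To make the comparison concrete, here is a rough break-even calculation for the case $\alpha_p = \alpha_q = 0$. It is only a sketch: it assumes that $c_0$ is linear in the two quantities, $c_0(r) = \text{c-wait} \cdot t\text{-wait}(r) + \text{c-send} \cdot \#\text{-send}(r)$, plugs in the approximations of Proposition 3.4 and Theorem 4.1, and ignores all $O(\epsilon)$ terms.

```latex
\begin{align*}
E^{SR_s^\delta}(c_0) &\approx \tau\,\text{c-wait}
  + \Bigl(\tfrac{(\tau+1)\beta_q}{\delta\beta_p} + 2\lceil 2\tau/\delta\rceil\Bigr)\,\text{c-send},\\
E^{SR_{hb}}(c_0) &\approx 2\tau\,\text{c-wait} + 2\lceil 2\tau/\delta\rceil\,\text{c-send},\\
E^{SR_{hb}}(c_0) - E^{SR_s^\delta}(c_0) &\approx \tau\,\text{c-wait}
  - \tfrac{(\tau+1)\beta_q}{\delta\beta_p}\,\text{c-send}.
\end{align*}
```

Under these assumptions, the heartbeat protocol has the lower expected cost roughly when $\tau \, \text{c-wait} < \frac{(\tau+1)\beta_q}{\delta\beta_p} \, \text{c-send}$, that is, when messages are expensive relative to waiting, or when q is much more likely to crash than p.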

This suggests that the decision of whether to implement a heartbeat layer must take probabilities 
and utilities seriously, even if we do not count either the overhead of building such a layer or the 
cost of heartbeats. What happens if we take the cost of heartbeats into account? This is the 
subject of the next section. 



The reader might notice that the runs induced by this protocol actually resemble those of the receiver-driven
protocol in the proof of Theorem 3.2 (if we identify HBMSG with REQ). The difference is that in the receiver-driven
protocol in the proof of Theorem 3.2, the protocol for the receiver actually sends the REQs, whereas here the HBMSGs
are sent not by the protocol but by an underlying heartbeat layer, independent of the protocol.

5 The Cost of Heartbeats



In the previous section we showed that S3 is achievable with the help of heartbeats. When we 
computed the expected costs, however, we did so with the cost function $c_0$, which does not count
the cost of heartbeats. While someone who takes the heartbeat layer for granted (such as an
application programmer or end-user) may have $c_0$ as their cost function, someone who has to decide
whether to implement a heartbeat layer or how frequently heartbeats should be sent (such as a
system designer) is likely to have a different cost function, one which takes the cost of heartbeats
into account.

As evidence of this, note that it is immediate from Theorem 4.1 that under the cost function
$c_0$, the choice of $\delta$ that minimizes the expected cost is clearly at most $2\tau + 1$. Intuitively, if we do
not charge for heartbeats, there is no incentive to space them out. On the other hand, if we do 
charge for heartbeats, then typically we will be charging for heartbeats that are sent long after a 
given invocation of SRhb has completed. 

The whole point of having a heartbeat layer is that heartbeats are meant to be used, not just 
by one invocation of a single protocol, but by multiple invocations of (possibly) many protocols. 
We would expect that the optimal frequency of heartbeats should depend in part on how often the 
protocols that use them are invoked. The picture we have is that the SRhb protocol is invoked from 
time to time, by different processes in the system. It may well be that various invocations of it 
are running simultaneously. All these invocations share the heartbeat messages, so their cost can 
be spread over all of them. If invocations occur often, then there will be few "wasted" heartbeats 
between invocations, and the analysis of the previous section gives a reasonably accurate reading
of the costs involved. On the other hand, if $\delta$ is small and invocations are infrequent, then there
will be many "wasted" heartbeats. We would expect that if there are infrequent invocations, then 
heartbeats should be spaced further apart. 

We now consider a setting that takes this into account. For simplicity, we continue to assume 
that there are only two processes, p and q, but we now allow both p and q to invoke $SR_{hb}$. (It is
possible to do this with n processes and more than one protocol, but the two-process and single
protocol case suffices to illustrate the main point, which is that the optimal $\delta$ should depend on
how often the protocol is invoked.) We assume that each process, while it is running, invokes $SR_{hb}$
with probability $\sigma$ at each time unit. Thus, informally, at every round, each running process tosses
a coin with probability $\sigma$ of landing heads. If it lands heads, the process then invokes $SR_{hb}$ with
the other as the recipient. (Note that we no longer assume that the protocol is invoked at time 0
in this section.)

Roughly speaking, in computing the cost of a run, we consider the cost of each invocation of 
SRhb together with the cost of all the heartbeat messages sent in the run. Our interest will then be 
in the cost per invocation of $SR_{hb}$. Thus, we apportion the cost of the heartbeat messages among
the invocations of $SR_{hb}$. If there are relatively few invocations of $SR_{hb}$, then there will be many
"wasted" heartbeat messages, whose cost will need to be shared among them.

For simplicity, let us assume that each time $SR_{hb}$ is invoked, a different message is sent. (For
example, messages could be numbered and include the name of the sender and recipient.) We say
$SR_{hb}(m)$ is invoked at time $t_1$ in r if at time $t_1$ some process x first executes line 1 of the code
of the sender with message m. This invocation of $SR_{hb}$ completes at time $t_2$ if the last message
associated with the invocation (either a copy of m or a copy of ACK(m)) is sent at time $t_2$. If x
received the last heartbeat message from the receiver before invoking $SR_{hb}(m)$, we take $t_2 = t_1$
(that is, the invocation completes as soon as it starts in this case).

The processes will (eventually) stop sending m or ACK(m) if either process crashes or if the
sender receives ACK(m). Thus, with probability 1, all invocations of $SR_{hb}$ will eventually complete.
Let $\#\text{-SR}(r, t)$ be the number of invocations of $SR_{hb}$ that have completed by time t in r; let $\text{c-SR}(r, t)$
be the cost of these invocations. Let $\text{c-HBMSG}(r, t)$ be the cost of sending HBMSGs up to time t in
r. This is simply the number of HBMSGs sent up to time t (which we denote by $\#\text{-HBMSG}(r, t)$)
multiplied by c-send. Let $c^{total}(r, t) = \text{c-SR}(r, t) + \text{c-HBMSG}(r, t)$. Finally, let

\[ c^{avg}(r) = \limsup_{t \to \infty} \frac{c^{total}(r, t)}{\#\text{-SR}(r, t) + 1}, \]

where "limsup" denotes the limit of the supremum, that is,

\[ c^{avg}(r) = \lim_{t' \to \infty} \sup_{t \ge t'} \frac{c^{total}(r, t)}{\#\text{-SR}(r, t) + 1}. \]

Thus $c^{avg}(r)$ is essentially the average cost per invocation of $SR_{hb}$, taking heartbeats into account.
We write "limsup" instead of "lim" since the limit may not exist in general. (However, the proof
of the next theorem shows that in fact, with probability 1, the limit does exist.) For the following
result only, we assume that $\sqrt{\beta_p}$ and $\sqrt{\beta_q}$ are also $O(\epsilon)$.

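Theorem 5.1: Under the cost function $c^{avg}$, Protocol $SR_{hb}$ satisfies S3. Furthermore,
$E(c^{avg}) \approx ((1 - \alpha_p)(1 - \alpha_q)\lambda + \alpha_p\alpha_q) \left( 2\left\lceil \frac{2\tau}{\delta} \right\rceil \text{c-send} + \left( \tau + \frac{\delta - 1}{2} \right) \text{c-wait} \right) + \frac{1}{\sigma\delta} \, \text{c-send}$, where $0 < \lambda < 1$.

Proof: See the appendix. ∎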

Note that with this cost function, we have a real decision to make in terms of how frequently
to send heartbeats. As before, there is some benefit to making $\delta > 2\tau$: it minimizes the number of
redundant messages sent when $SR_{hb}$ is invoked (that is, messages sent by the sender before receiving
the receiver's acknowledgment). Also, by making $\delta$ larger we will send fewer heartbeat messages
between invocations of $SR_{hb}$. On the other hand, if we make $\delta$ too large, then the sender may have
to wait a long time after invoking $SR_{hb}$ before it can send a message to the receiver (since messages
are only sent upon receipt of a heartbeat). Intuitively, the greater c-wait is relative to c-send, the
smaller we should make $\delta$. Clearly we can find an optimal choice for $\delta$ by standard calculus.
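The trade-off can be illustrated numerically. The sketch below is not the authors' procedure: it assumes that the estimate of Theorem 5.1 has the form $K \left( 2\lceil 2\tau/\delta \rceil \, \text{c-send} + (\tau + (\delta - 1)/2) \, \text{c-wait} \right) + \text{c-send}/(\sigma\delta)$, where $K = (1 - \alpha_p)(1 - \alpha_q)\lambda + \alpha_p\alpha_q$ is treated as a known constant `k`, and it searches over integer values of $\delta$ by brute force instead of using calculus, because of the ceiling. The function names and the search bound `max_delta` are hypothetical.

```python
import math


def expected_avg_cost(delta, tau, sigma, k, c_send, c_wait):
    """Assumed per-invocation cost estimate for heartbeat interval delta."""
    protocol = k * (
        2 * math.ceil(2 * tau / delta) * c_send
        + (tau + (delta - 1) / 2) * c_wait
    )
    heartbeats = c_send / (sigma * delta)  # heartbeat cost shared among invocations
    return protocol + heartbeats


def best_delta(tau, sigma, k, c_send, c_wait, max_delta=10_000):
    """Return the integer delta in [1, max_delta] with the lowest estimated cost."""
    return min(
        range(1, max_delta + 1),
        key=lambda d: expected_avg_cost(d, tau, sigma, k, c_send, c_wait),
    )
```

Under these assumptions, expensive waiting pushes the optimum down to 1 (for example `best_delta(5, 0.01, 1.0, 1.0, 1000.0)`), while cheap waiting and rare invocations push it far above $2\tau$ (for example `best_delta(5, 0.001, 1.0, 1.0, 0.01)`).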

In the model just presented, if c-wait is large enough relative to c-send, we will take $\delta$ to be
1. Taking $\delta$ this small is clearly inappropriate once we consider a more refined model, where there
are buffers that may overflow. In this case, both the probability of message loss and the time for
message delivery will depend on the number of messages in transit. The basic notions of utility
still apply, of course, although the calculations become more complicated. This just emphasizes the
obvious point that in deciding what value (or values) $\delta$ should have, we need to carefully look at
the actual system and the cost function.



6 Discussion 

We have tried to argue here for the use of decision theory both in the specification and the design 
of systems. Our (admittedly rather simple) analysis already shows both how decision theory can 
help guide the decision made and how much the decision depends on the cost function. None of our 
results are deep; the cost function just makes precise what could already have been seen from an 
intuitive calculation. But this is precisely the point: By writing our specification in terms of costs, 
we can make the intuitive calculations precise. Moreover, the specification forces us to make clear
exactly what the cost function is and encourages the elicitation of utilities from users. We believe
that these are both important features. It is important for the user (and system designer) to spend
time thinking about what the important attributes of the system are and to decide on preferences
between various tradeoffs.

Footnote (on the definition of $c^{avg}$ in Section 5): By adding 1 to the denominator, we guarantee it is never 0;
adding 1 also simplifies one of the technical calculations needed in the proof of Theorem 5.1.

A possible future direction is to study standard problems in the literature (e.g., Consensus, 
Byzantine Agreement, Atomic Broadcast, etc.) and recast the specifications in utility-theoretic 
terms. One way to do this is to replace a liveness requirement by an unbounded increasing cost 
function (which is essentially the "cost of waiting") and replace a safety requirement by a large 
penalty. Once we do this, we can analyze the algorithms that have been used to solve these 
problems, and see to what extent they are optimal given reasonable assumptions about probabilities 
and utilities. 

While we believe that there is a great deal of benefit to be gained from analyzing systems in 
terms of utility, it is quite often a nontrivial matter. Among the most significant difficulties are the 
following: 

1. Where are the utilities coming from? It is far from clear that a user can or is willing to assign 
a real-valued utility to all possible outcomes in practice. There may be computational issues 
(for example, the set of outcomes can be enormous) as well as psychological issues. While 
the agent may be prepared to assign qualitative utilities like "good", "fair", or "bad", he 
may not be prepared to assign 20.7. While to some extent the system can convert qualitative 
utilities to a numerical representation, this conversion may not precisely capture the user's
intent. There are also nontrivial user-interface issues involved in eliciting utilities from users. 
In light of this, we need to be very careful if results depend in sensitive ways on the details 
of the utilities. 

2. Where are the probabilities coming from? We do not expect users to be experts at probability.
Rather, we expect the system to be gathering statistics and using them to estimate
the probabilities. Of course, someone still has to tell the system what statistics to gather. 
Moreover, our statistics may be so sparse that we cannot easily obtain a reliable estimate of 
the probability. 

3. Why is it even appropriate to maximize expected utility? There are times when it is far 
from clear that this is the best thing to do, especially if our estimates of the probability 
and utility are suspect. For example, suppose one action has a guaranteed utility of 100 (on 
some appropriate scale), while another has an expected utility of 101, but has a nontrivial 
probability of having utility 0. If the probabilities and utilities that were used to calculate the 
expectation are reliable, and we anticipate performing these actions frequently, then there is 
a good case to be made for taking the action with the higher expected utility. On the other 
hand, if the underlying numbers are suspect, then the action with the guaranteed utility 
might well be preferable. 

We see these difficulties not as ones that should prevent us from using decision theory, but
rather as directions for further research. It may be possible in many cases to learn a user's utility. 

Moreover, we expect that in many applications, except for a small region of doubt, the choice of 
which decision to make will be quite robust, in that perturbations to the probability and utility will 
not change the decision. Even in cases where perturbations do change the decision, both decisions 
will have roughly equal expected utility. Thus, as long as we can get somewhat reasonable estimates 
of the probability and utility, decision theory may have something to offer. 

Another important direction for research is to consider qualitative decision theory, where both 
utility and likelihood are more qualitative, and not necessarily real numbers. This is, in fact,
an active area of current research, as http://www.medg.lcs.mit.edu/qdt/bib/unsorted.bib (a
bibliography of over 290 papers) attests. Note that once we use more qualitative notions, then we
may not be able to compute expected utilities at all (since utilities may not be numeric) let alone
take the action with maximum expected utility, so we will have to consider other decision rules.

Finally, we might consider what would be an appropriate language to specify and reason about 
utilities, both for the user and the system designer. 

While it is clear that there is still a great deal of work to be done in order to use decision-
theoretic techniques in systems design and specification, we hope that this discussion has convinced
the reader of the utility of the approach.

Appendix: Proofs

We present the proofs of Proposition 3.4 and Theorem 5.1. We repeat the statements of the results
for the convenience of the reader. Recall that for Proposition 3.4, we are assuming that $\beta_p$ and $\beta_q$
are both $\Theta(\epsilon)$, and that for Theorem 5.1, we are assuming that $\sqrt{\beta_p}$ and $\sqrt{\beta_q}$ are both $O(\epsilon)$.

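Proposition 3.4: If $\alpha_p = \alpha_q = 0$, then

\[ E^{SR_{tr}}(t\text{-wait}) = \frac{1 - (\beta_p + \beta_q - \beta_p\beta_q)}{\beta_p + \beta_q - \beta_p\beta_q}, \qquad E^{SR_{tr}}(\#\text{-send}) = 0, \]

\[ E^{SR_s^\delta}(t\text{-wait}) \approx \tau, \qquad E^{SR_s^\delta}(\#\text{-send}) \approx \frac{(\tau+1)\beta_q}{\delta\beta_p} + 2\left\lceil \frac{2\tau}{\delta} \right\rceil, \]

\[ E^{SR_r^\delta}(t\text{-wait}) \approx 2\tau, \qquad E^{SR_r^\delta}(\#\text{-send}) \approx \frac{(\tau+1)\beta_p}{\delta\beta_q} + 2\left\lceil \frac{2\tau}{\delta} \right\rceil. \]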



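Proof: For $SR_{tr}$, note that $\#\text{-send}(r) = 0$ for all r, so $E^{SR_{tr}}(\#\text{-send}) = 0$. We also have that
$t\text{-wait}(r)$ is the time of the first crash in r. Since the probability of a crash during a time unit is
$P = \beta_p + \beta_q - \beta_p\beta_q$, we have that the expected time of the first crash, and hence $E^{SR_{tr}}(t\text{-wait})$, is

\[ \sum_{k=0}^{\infty} k(1 - P)^k P = \frac{P(1 - P)}{(1 - (1 - P))^2} = \frac{1 - P}{P} = \frac{1 - (\beta_p + \beta_q - \beta_p\beta_q)}{\beta_p + \beta_q - \beta_p\beta_q}. \]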
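The closed form is easy to check numerically. The snippet below is a small sanity check, not part of the proof; it truncates the infinite series at a hypothetical cutoff `terms`, which is harmless as long as $(1 - P)^{\text{terms}}$ is negligible.

```python
def expected_first_crash(beta_p, beta_q, terms=20_000):
    """Return (truncated series, closed form) for the expected time of the first crash."""
    p = beta_p + beta_q - beta_p * beta_q  # probability of a crash in one time unit
    series = sum(k * (1 - p) ** k * p for k in range(terms))
    closed = (1 - p) / p
    return series, closed
```

For example, with `beta_p = beta_q = 0.01` both values agree to well within floating-point noise.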



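For $SR_s^\delta$, we first show that $E^{SR_s^\delta}(t\text{-wait}) \approx \tau$. Since $\alpha_p = \alpha_q = 0$, $\Pr(t\text{-wait}(r) = \infty) = 0$, thus
$E^{SR_s^\delta}(t\text{-wait}) = \sum_{k=1}^{\infty} k \Pr(t\text{-wait} = k)$. We break the sum into three pieces,

• $\sum_{k=1}^{\tau-1} k \Pr(t\text{-wait} = k)$,

• $\tau \Pr(t\text{-wait} = \tau)$, and

• $\sum_{k=\tau+1}^{\infty} k \Pr(t\text{-wait} = k)$,

and analyze each one separately.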

For the first part, note that the only way that $t\text{-wait} = k$ for $1 \le k < \tau$ is for there to be a
crash before $\tau$. Thus

\[ \Pr(t\text{-wait} = k) = ((1 - \beta_p)(1 - \beta_q))^k (\beta_p + \beta_q - \beta_p\beta_q) < \beta_p + \beta_q. \]

It follows that

\[ \sum_{k=1}^{\tau-1} k \Pr(t\text{-wait} = k) < (\beta_p + \beta_q) \sum_{k=1}^{\tau-1} k = (\beta_p + \beta_q) \frac{\tau(\tau - 1)}{2} \approx 0. \]

Thus we may drop the first part.

For the second part, note that $t\text{-wait} = \tau$ if p and q are up until $\tau$ and q received the first copy
of m that p sent. (We may also have $t\text{-wait} = \tau$ if one of p or q crashes at time $\tau$.) Thus,

\[ \Pr(t\text{-wait} = \tau) \ge ((1 - \beta_p)(1 - \beta_q))^{\tau}(1 - \gamma) \approx 1, \]

so the second part is $\approx \tau$.

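Finally, for the third part, if $k > \tau$, then k has the form $\tau + a\delta + b$, where $a \ge 0$ and $0 \le b < \delta$
(and $a + b > 0$). If $t\text{-wait} = k = \tau + a\delta + b$, then $a + 1$ messages are lost by the link, so
$\Pr(t\text{-wait} = k) \le \gamma^{a+1}$. A straightforward calculation shows that

\begin{align*}
\sum_{k=\tau+1}^{\infty} k \Pr(t\text{-wait} = k)
&= \sum_{b=1}^{\delta-1} (\tau + b) \Pr(t\text{-wait} = \tau + b)
 + \sum_{a=1}^{\infty} \sum_{b=0}^{\delta-1} (\tau + a\delta + b) \Pr(t\text{-wait} = \tau + a\delta + b) \\
&\le \sum_{a=0}^{\infty} \delta(\tau + (a + 1)\delta)\gamma^{a+1} \\
&= \sum_{a=0}^{\infty} ((a + 1)\delta^2 + \delta\tau)\gamma^{a+1} \\
&= \delta^2 \sum_{a=1}^{\infty} a\gamma^a + \delta\tau \sum_{a=1}^{\infty} \gamma^a \\
&\approx 0.
\end{align*}

Thus, we can also ignore the third part. This gives us $E^{SR_s^\delta}(t\text{-wait}) \approx \tau$, as desired.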

Now let us turn to $E^{SR_s^\delta}(\#\text{-send})$. Let us say that a send is successful iff the link does not
drop the message (which could be an ACK). Consider the set of runs $A = \{r : q$ successfully sends
ACK(m) before crashing in $r\}$. Roughly speaking, what happens is that in runs of A, p receives
ACK(m) at time $2\tau$ with probability $\approx 1$. In the meantime, p has sent m exactly $\lceil 2\tau/\delta \rceil$ times with
probability $\approx 1$. With probability $\approx 1$, all of these are received by q; q in turn acknowledges all
copies and thus $E^{SR_s^\delta}(\#\text{-send} \mid A) \approx 2\lceil 2\tau/\delta \rceil$; that is why this term appears in $E^{SR_s^\delta}(\#\text{-send})$. In $\overline{A}$,
the expected value of #-send is very large, since p will send m until it crashes, so despite the low
probability of $\overline{A}$, it contributes the term $\frac{(\tau+1)\beta_q}{\delta\beta_p}$. We now turn to the details.

We first compute $\Pr(A)$. Note that q can send ACK(m) only at times of the form $\tau + k\delta$. Let
$B_k = \{r : q$ sends the first successful ACK(m) at time $\tau + k\delta\}$. Note that $A = \bigcup_{k=0}^{\infty} B_k$ and that
$B_i \cap B_j = \emptyset$ if $i \ne j$. Thus $\Pr(A) = \sum_{k=0}^{\infty} \Pr(B_k)$. Since q sends the first successful ACK(m) at time
$\tau + k\delta$ in runs of $B_k$, p must (successfully) send m at time $k\delta$ in runs of $B_k$. Thus

\[ \Pr(B_k) = (1 - \beta_p)^{k\delta+1}(1 - \beta_q)^{\tau+k\delta+1}(2\gamma - \gamma^2)^k(1 - \gamma)^2. \]

The first factor reflects the fact that p must have been up at time $k\delta$ (to send m) while the second
factor reflects the fact that q must have been up at time $\tau + k\delta$ (to receive m and send ACK(m)).
The third factor reflects the fact that the previous k attempts have failed: either m was lost or the
corresponding ACK(m) was lost, which occurs with probability $\gamma + (1 - \gamma)\gamma = 2\gamma - \gamma^2$. The final
factor reflects the fact that the $(k + 1)$st attempt succeeded: both messages got through. So

\begin{align*}
\Pr(A) &= \sum_{k=0}^{\infty} \Pr(B_k) \\
&= \sum_{k=0}^{\infty} (1 - \beta_p)^{k\delta+1}(1 - \beta_q)^{\tau+k\delta+1}(2\gamma - \gamma^2)^k(1 - \gamma)^2 \\
&= (1 - \beta_p)(1 - \beta_q)^{\tau+1}(1 - \gamma)^2 \sum_{k=0}^{\infty} ((1 - \beta_p)(1 - \beta_q))^{k\delta}(2\gamma - \gamma^2)^k \\
&= \frac{(1 - \beta_p)(1 - \beta_q)^{\tau+1}(1 - \gamma)^2}{1 - ((1 - \beta_p)(1 - \beta_q))^{\delta}(2\gamma - \gamma^2)} \\
&= (1 - \beta_p)(1 - \beta_q)^{\tau+1}(1 - \gamma)^2((1 + 2\gamma) + O(\epsilon^2)) \\
&= 1 - \beta_p - (\tau + 1)\beta_q + O(\epsilon^2) \\
&\approx 1.
\end{align*}
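As a sanity check on the first-order approximation, the series can be summed numerically. This is a sketch under the assumption that $\Pr(B_k)$ has the form $(1 - \beta_p)^{k\delta+1}(1 - \beta_q)^{\tau+k\delta+1}(2\gamma - \gamma^2)^k(1 - \gamma)^2$; the function name and the cutoff `terms` are hypothetical.

```python
def prob_ack_before_crash(tau, delta, beta_p, beta_q, gamma, terms=1000):
    """Return (truncated series for Pr(A), first-order approximation)."""
    attempt_fails = 2 * gamma - gamma ** 2  # m or its acknowledgment is lost
    series = sum(
        (1 - beta_p) ** (k * delta + 1)
        * (1 - beta_q) ** (tau + k * delta + 1)
        * attempt_fails ** k
        * (1 - gamma) ** 2
        for k in range(terms)
    )
    first_order = 1 - beta_p - (tau + 1) * beta_q
    return series, first_order
```

With all failure probabilities around $10^{-4}$, the two values differ only by a quantity of the order of the squares of those probabilities, as the $O(\epsilon^2)$ term predicts.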

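We now want to compute $E^{SR_s^\delta}(\#\text{-send} \mid A)$. Again, we break $E^{SR_s^\delta}(\#\text{-send} \mid A)$ into three pieces,

• $\sum_{k=0}^{2\lceil 2\tau/\delta \rceil - 1} k \Pr(\#\text{-send} = k \mid A)$,

• $2\lceil 2\tau/\delta \rceil \Pr(\#\text{-send} = 2\lceil 2\tau/\delta \rceil \mid A)$, and

• $\sum_{k=2\lceil 2\tau/\delta \rceil + 1}^{\infty} k \Pr(\#\text{-send} = k \mid A)$,

and compute each part separately.

Note that $\Pr(\#\text{-send} = k \mid A) \le \beta_p + \beta_q + \gamma$ for $k < 2\lceil 2\tau/\delta \rceil$, since either a process crashed or a
message is lost. Thus the first part is no more than $(2\lceil 2\tau/\delta \rceil)^2(\beta_p + \beta_q + \gamma) \approx 0$, so we may ignore it.

For the second part, we have $\Pr(\#\text{-send} = 2\lceil 2\tau/\delta \rceil \mid A) \approx 1$, since if p is up at time $2\tau$, q is up at time
$(\lceil 2\tau/\delta \rceil - 1)\delta + \tau$, all of p's sends got through, and q's first ACK(m) got through, then
$\#\text{-send} = 2\lceil 2\tau/\delta \rceil$; thus the second part is $\approx 2\lceil 2\tau/\delta \rceil$. We now turn our attention to the last part.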

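Note that p sends at least half the messages in every run r (whether $r \in A$ or $r \in \overline{A}$). Note also
that, after the first successful attempt (that is, after the first message sent by p which is received by
q whose corresponding acknowledgment is not lost by the link), p will send at most $\lceil 2\tau/\delta \rceil$ messages,
since p would stop sending $2\tau$ time units after the first successful attempt (either because p received
ACK(m) or p crashed). Combining the above two observations, we see that if $\#\text{-send}(r) = 2\lceil 2\tau/\delta \rceil + k$
for $k > 0$, then p must have sent at least $\lceil 2\tau/\delta \rceil + \lceil k/2 \rceil$ messages and there are at least $\lceil k/2 \rceil$ unsuccessful
attempts in r. Thus, $\Pr(\#\text{-send} = 2\lceil 2\tau/\delta \rceil + k \mid A) \le (2\gamma - \gamma^2)^{\lceil k/2 \rceil}$. So we have

\begin{align*}
\sum_{k=2\lceil 2\tau/\delta \rceil + 1}^{\infty} k \Pr(\#\text{-send} = k \mid A)
&\le \sum_{k=1}^{\infty} (2\lceil 2\tau/\delta \rceil + k)(2\gamma - \gamma^2)^{\lceil k/2 \rceil} \\
&= \sum_{k=0}^{\infty} (4\lceil 2\tau/\delta \rceil + (2k + 1) + (2k + 2))(2\gamma - \gamma^2)^{k+1} \\
&= \sum_{k=0}^{\infty} (4\lceil 2\tau/\delta \rceil + 4k + 3)(2\gamma - \gamma^2)^{k+1} \\
&\approx 0.
\end{align*}

So we may ignore the last part as well. Thus $E^{SR_s^\delta}(\#\text{-send} \mid A) \approx 2\lceil 2\tau/\delta \rceil$. Since $\Pr(A) \approx 1$, we
have $E^{SR_s^\delta}(\#\text{-send} \mid A) \Pr(A) \approx 2\lceil 2\tau/\delta \rceil$.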

We now focus on $E^{SR_s^\delta}(\#\text{-send} \mid \overline{A}) \Pr(\overline{A})$. Recall that for $r \in \overline{A}$, q fails to successfully send
ACK(m) in r. Consider the following three sets (which form a partition of the set of all runs):

• $C_1 = \{r : p$ crashes at time 0 in $r\}$,

• $C_2 = \{r : p$ does not crash at time 0 and q crashes at or before time $\tau$ in $r\}$, and

• $C_3 = \{r : p$ does not crash at time 0 and q does not crash at or before time $\tau$ in $r\}$.

We now show that these are their probabilities:

• $\Pr(C_1 \cap \overline{A}) = \beta_p$,

• $\Pr(C_2 \cap \overline{A}) = (1 - \beta_p)(1 - (1 - \beta_q)^{\tau+1}) = (\tau + 1)\beta_q + O(\epsilon^2)$, and

• $\Pr(C_3 \cap \overline{A}) = O(\epsilon^2)$.

First note that $\Pr(C_1) = \beta_p$ and $\Pr(C_2) = (1 - \beta_p)(1 - (1 - \beta_q)^{\tau+1}) = (\tau + 1)\beta_q + O(\epsilon^2)$.
Furthermore, $C_1 \cup C_2 \subseteq \overline{A}$, since if $r \in C_1 \cup C_2$, q does not send ACK(m) successfully before
crashing. Thus $\Pr(C_1 \cap \overline{A}) = \beta_p$ and $\Pr(C_2 \cap \overline{A}) = (\tau + 1)\beta_q + O(\epsilon^2)$. Since, as we showed earlier,
$\Pr(A) = 1 - \beta_p - (\tau + 1)\beta_q + O(\epsilon^2)$, it also follows that $\Pr(C_3 \cap \overline{A}) = O(\epsilon^2)$.

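Now that we have $\Pr(C_i \cap \bar{A})$, let us turn to $E^{SR_s}(\#\text{-send} \mid C_i \cap \bar{A})$. Note that for $r \in \bar{A}$, p will send messages until it crashes. For $r \in C_1$, p crashes immediately, so $\#\text{-send}(r) = 0$ for $r \in C_1$. For $r \in C_2$, q crashes before it can possibly send any messages, so all the messages are sent by p. Thus

$$\Pr(\#\text{-send} = k \mid C_2) = (1 - \beta_p)^{(k-1)\delta}\,(1 - (1 - \beta_p)^{\delta}),$$

since p must be up at time $(k-1)\delta$ and crash before time $k\delta$ to send m exactly k times. So

$$E^{SR_s}(\#\text{-send} \mid C_2 \cap \bar{A}) = \sum_{k=1}^{\infty} k (1 - \beta_p)^{(k-1)\delta}(1 - (1 - \beta_p)^{\delta})$$
$$= \frac{1 - (1 - \beta_p)^{\delta}}{(1 - \beta_p)^{\delta}} \sum_{k=1}^{\infty} k \left((1 - \beta_p)^{\delta}\right)^{k}$$
$$= \frac{1 - (1 - \beta_p)^{\delta}}{(1 - \beta_p)^{\delta}} \cdot \frac{(1 - \beta_p)^{\delta}}{(1 - (1 - \beta_p)^{\delta})^{2}}$$
$$= \frac{1}{1 - (1 - \beta_p)^{\delta}}$$
$$= \frac{1}{\delta\beta_p} + O(1).$$

The $O(1)$ term is there because $\frac{1}{1 - (1 - \beta_p)^{\delta}} - \frac{1}{\delta\beta_p} = \frac{1}{\delta\beta_p + O(\epsilon^2)} - \frac{1}{\delta\beta_p} = \frac{O(\epsilon^2)}{(\delta\beta_p)(\delta\beta_p + O(\epsilon^2))}$, which is $O(1)$, since we assumed that $\beta_p$ is $\Theta(\epsilon)$ for this proposition.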
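The geometric-series step and the size of the O(1) correction can be checked numerically. The sketch below assumes the conditional distribution Pr(#-send = k | C2) = x^(k-1)(1 - x) with x = (1 - β_p)^δ, as read from the surrounding argument. The values of `beta_p` and `delta` are hypothetical.

```python
def expected_sends_series(beta_p: float, delta: int, terms: int = 5000) -> float:
    """Truncated sum of k * x**(k-1) * (1 - x), with x = (1 - beta_p)**delta."""
    x = (1 - beta_p) ** delta
    return sum(k * x ** (k - 1) * (1 - x) for k in range(1, terms + 1))


def expected_sends_closed(beta_p: float, delta: int) -> float:
    """Closed form 1 / (1 - (1 - beta_p)**delta) of the same series."""
    return 1 / (1 - (1 - beta_p) ** delta)


delta = 3  # hypothetical resend period
for beta_p in (1e-2, 1e-3, 1e-4):
    correction = expected_sends_closed(beta_p, delta) - 1 / (delta * beta_p)
    # The correction stays bounded as beta_p shrinks, i.e. it is O(1).
    print(beta_p, correction)
```

For small `beta_p` the correction settles near (δ - 1)/(2δ), so the leading term 1/(δβ_p) dominates.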

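For $r \in C_3 \cap \bar{A}$, q might send messages (none of which, however, will get through). Let $E_k = \{r \in C_3 \cap \bar{A} : p \text{ crashes at time } k\}$. We have $\Pr(E_k) \le (1 - \beta_p)^k \beta_p$. Furthermore, $E^{SR_s}(\#\text{-send} \mid E_k) \le 2\lceil k/\delta \rceil$, since p sends $\lceil k/\delta \rceil$ messages in $E_k$ and q sends at most that many messages. So we have

$$E^{SR_s}(\#\text{-send} \mid C_3 \cap \bar{A}) = \sum_{k=1}^{\infty} E^{SR_s}(\#\text{-send} \mid E_k)\Pr(E_k)$$
$$\le \sum_{k=1}^{\infty} 2\lceil k/\delta \rceil (1 - \beta_p)^k \beta_p$$
$$\le \sum_{k=1}^{\infty} 2\left(\frac{k}{\delta} + 1\right)(1 - \beta_p)^k \beta_p$$
$$= \frac{2}{\delta}\sum_{k=1}^{\infty} k(1 - \beta_p)^k \beta_p + 2\sum_{k=1}^{\infty} (1 - \beta_p)^k \beta_p$$
$$= \frac{2(1 - \beta_p)}{\delta\beta_p} + 2(1 - \beta_p).$$

Since we assumed that $\beta_p$ is $\Theta(\epsilon)$ and we showed that $\Pr(C_3 \cap \bar{A}) = O(\epsilon^2)$, we get $E^{SR_s}(\#\text{-send} \mid C_3 \cap \bar{A})\Pr(C_3 \cap \bar{A}) = O(\epsilon)$. Recall that $E^{SR_s}(\#\text{-send} \mid C_1 \cap \bar{A}) = 0$, so

$$E^{SR_s}(\#\text{-send} \mid \bar{A})\Pr(\bar{A}) \approx E^{SR_s}(\#\text{-send} \mid C_2 \cap \bar{A})\Pr(C_2 \cap \bar{A}) \approx \frac{(\tau + 1)\beta_q}{\delta\beta_p}.$$

This gives us $E^{SR_s}(\#\text{-send}) \approx \frac{(\tau + 1)\beta_q}{\delta\beta_p} + 2\lceil 2\tau/\delta \rceil$, as desired.

The reasoning for the $SR_r$ case is similar to the $SR_s$ case. The only major difference is that q cannot possibly finish RECEIVing m before time $2\tau$. We leave details to the reader. ∎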



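Theorem 5.1: Under the cost function $c^{\mathrm{avg}}$, Protocol $SR_{hb}$ satisfies S3. Furthermore, $E(c^{\mathrm{avg}}) \approx ((1 - \alpha_p)(1 - \alpha_q)\lambda + \alpha_p\alpha_q)\left(2\lceil 2\tau/\delta \rceil\,\text{c-send} + \left(\tau + \frac{\delta - 1}{2}\right)\text{c-wait}\right) + \frac{1}{\delta\sigma}\,\text{c-send}$, where $0 < \lambda < 1$.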

Proof: Roughly speaking, the first summand corresponds to the expected per-invocation cost of 
the protocol and the second corresponds to the expected per-invocation cost of the heartbeats. To 
do the analysis carefully, we divide the set of runs into three subsets: 

• $F_1$ = {r : one process is correct and the other eventually crashes in r},

• $F_2$ = {r : both processes are correct in r}, and

• $F_3$ = {r : both processes eventually crash in r}.

These are their probabilities:

• $\Pr(F_1) = \alpha_p(1 - \alpha_q) + \alpha_q(1 - \alpha_p)$,

• $\Pr(F_2) = \alpha_p\alpha_q$, and

• $\Pr(F_3) = (1 - \alpha_p)(1 - \alpha_q)$.
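The three probabilities above form a partition, and they explain the coefficient in front of the per-invocation protocol cost in Theorem 5.1: runs with exactly one correct process contribute 0, runs with two correct processes contribute the full cost, and runs where both crash contribute a fraction λ of it. The following is a small sketch of that bookkeeping. The function names and numeric inputs are hypothetical, and λ is treated as a given constant since the proof only bounds it.

```python
def partition_probabilities(alpha_p: float, alpha_q: float):
    """Return (Pr(F1), Pr(F2), Pr(F3)) for independent correctness probabilities."""
    f1 = alpha_p * (1 - alpha_q) + alpha_q * (1 - alpha_p)  # exactly one correct
    f2 = alpha_p * alpha_q                                   # both correct
    f3 = (1 - alpha_p) * (1 - alpha_q)                       # both eventually crash
    return f1, f2, f3


def protocol_cost_weight(alpha_p: float, alpha_q: float, lam: float) -> float:
    """Weight on the per-invocation cost: 0 on F1, 1 on F2, lam on F3."""
    f1, f2, f3 = partition_probabilities(alpha_p, alpha_q)
    return 0.0 * f1 + 1.0 * f2 + lam * f3


print(partition_probabilities(0.9, 0.8))
print(protocol_cost_weight(0.9, 0.8, lam=0.5))
```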

For $r \in F_1$, we expect the lone correct process to invoke $SR_{hb}$ infinitely often. All but finitely many of these invocations will take place after the other process crashed. Thus the average cost of an invocation in r will be 0. For $r \in F_2$, on the other hand, both processes are expected to invoke $SR_{hb}$ infinitely often and the average cost of the invocation in r is expected to be close to the expected cost of a single invocation of $SR_{hb}$. The computation of the expected cost of an invocation in a run in $F_3$ is more delicate. We now examine the details.

Let $G_1$ be the subset of $F_1$ consisting of runs r in which the correct process tries to invoke the protocol infinitely often. Clearly $\Pr(G_1 \mid F_1) = 1$, since the protocol is invoked with probability $\sigma$ at each time unit. Moreover, for each run $r \in G_1$, we have

$$\lim_{t \to \infty} \frac{\text{c-SR}(r, t)}{\#\text{-SR}(r, t) + 1} = 0,$$

since there are only finitely many complete invocations with non-zero cost and there are infinitely many complete invocations. Thus, $E(c^{\mathrm{avg}} \mid F_1) = 0$.

Let $G_2$ be the subset of $F_2$ where there are infinitely many invocations of $SR_{hb}$. Clearly $\Pr(G_2 \mid F_2) = 1$. Let $Z = 2\lceil 2\tau/\delta \rceil\,\text{c-send} + \left(\tau + \frac{\delta - 1}{2}\right)\text{c-wait}$. By the Law of Large Numbers, for almost all runs r of $G_2$, the analysis of Proposition p.4 shows that

$$\lim_{t \to \infty} \frac{\text{c-SR}(r, t)}{\#\text{-SR}(r, t) + 1} = Z.$$

(Note that we have $\tau + \frac{\delta - 1}{2}$ instead of $2\tau$ as in Theorem LI. This is because in the current setting, the expected amount of time elapsed between the start of an invocation and the arrival of the first HBMSG is $\frac{\delta - 1}{2}$. In the setting of that theorem, however, the first HBMSG cannot arrive until time $\tau$, since the invocation starts at time 0 and the first HBMSG is sent at time 0. Note that in both cases, the expected time of waiting is $\tau$ plus the expected time elapsed between the start of the invocation and the arrival of the next HBMSG.) Thus $\Pr(c^{\mathrm{avg}}(r) \approx Z \mid F_2) = 1$.

We now turn our attention to $F_3$. Let $F_3(t_1, t_2, i_1, i_2, i_3)$ be a subset of $F_3$ with the following properties:

• the first crash in r happens at time $t_1$,

• the second crash in r happens at time $t_2$,

• the number of invocations starting before time $t_1 - 3\tau - \delta$ is $i_1$,

• the number of invocations starting between times $t_1 - 3\tau - \delta$ and $t_1 + \tau$ is $i_2$, and

• the number of invocations starting after time $t_1 + \tau$ is $i_3$.

It is clear that each of these sets is measurable. (Some of them are empty, so they will have probability 0; we could introduce restrictions to rule out the empty ones, but leaving them in is not a problem.)

Suppose $F_3(t_1, t_2, i_1, i_2, i_3)$ is not empty. Then

$$E(c^{\mathrm{avg}} \mid F_3(t_1, t_2, i_1, i_2, i_3)) \approx \frac{i_1 + \kappa(t_1, t_2, i_1, i_2, i_3)\, i_2}{i_1 + i_2 + i_3 + 1}\, Z,$$

where $0 < \kappa(t_1, t_2, i_1, i_2, i_3) < 1$. Roughly speaking, the expected cost of an invocation in the first group is Z, since if no messages are lost (which happens with probability approximately 1), the number of messages sent is exactly $2\lceil 2\tau/\delta \rceil$ and the time of waiting is between $\tau$ and $\tau + \delta - 1$, depending on when the first HBMSG arrives after the invocation starts. If no messages are lost, a HBMSG is received every $\delta$ time units, so the wait for a HBMSG is $\frac{\delta - 1}{2}$ on average. Thus the first group of invocations contributes $i_1 Z$ to c-SR(r), on average. As for the second group, its invocations contribute something less than $i_2 Z$ to c-SR(r) on average; in many of these invocations, the first process crash (which happens at most $3\tau + \delta$ after the beginning of an invocation in the second group) may reduce the time of waiting or the number of messages sent. That is why we have a multiplicative constant $\kappa(t_1, t_2, i_1, i_2, i_3)$ in front of $i_2$. The last group of invocations all have zero cost, since by the time they started, the surviving process (which must be the invoker) will never receive any new HBMSGs from the crashed process; so the time of waiting and the number of messages sent are both zero. Thus we have



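$$E(c^{\mathrm{avg}} \mid F_3) = \sum_{t_1, t_2, i_1, i_2, i_3} E(c^{\mathrm{avg}} \mid F_3(t_1, t_2, i_1, i_2, i_3)) \Pr(F_3(t_1, t_2, i_1, i_2, i_3))$$
$$\approx \sum_{t_1, t_2, i_1, i_2, i_3} \frac{i_1 + \kappa(t_1, t_2, i_1, i_2, i_3)\, i_2}{i_1 + i_2 + i_3 + 1}\, Z \Pr(F_3(t_1, t_2, i_1, i_2, i_3)).$$

Let

$$\lambda = \sum_{t_1, t_2, i_1, i_2, i_3} \frac{i_1 + \kappa(t_1, t_2, i_1, i_2, i_3)\, i_2}{i_1 + i_2 + i_3 + 1} \Pr(F_3(t_1, t_2, i_1, i_2, i_3)).$$

Clearly $\lambda < 1$ and $E(c^{\mathrm{avg}} \mid F_3) \approx \lambda Z$, as desired.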

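Now we turn to the expected heartbeat costs per invocation. Each process will send a HBMSG every $\delta$ time units for as long as it is up. So if in r a process is up at time t, then it sent $\lceil t/\delta \rceil$ HBMSGs in r up to time t. Suppose $r \in F_2$. Then $\#\text{-HBMSG}(r, t) = 2\lceil t/\delta \rceil$, and by the Law of Large Numbers, for all $\eta > 0$,

$$\Pr\left(\lim_{t \to \infty} |\#\text{-SR}(r, t) - 2t\sigma| < \eta t \;\middle|\; F_2\right) = 1.$$

Thus,

$$\Pr\left(\lim_{t \to \infty} \frac{\#\text{-HBMSG}(r, t)}{\#\text{-SR}(r, t) + 1} = \frac{1}{\delta\sigma} \;\middle|\; F_2\right) = 1.$$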
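The limit for runs in which both processes stay up can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not taken from the paper: both processes never crash, each sends a heartbeat at every multiple of δ starting at time 0, and each independently starts an invocation with probability σ at every time unit. The parameter values are hypothetical.

```python
import random


def heartbeat_ratio(delta: int, sigma: float, horizon: int, seed: int = 0) -> float:
    """Simulate #-HBMSG / (#-SR + 1) over `horizon` time units with two live processes."""
    rng = random.Random(seed)
    heartbeats = 0
    invocations = 0
    for t in range(horizon):
        for _process in range(2):
            if t % delta == 0:          # heartbeat sent every delta time units
                heartbeats += 1
            if rng.random() < sigma:    # invocation starts with probability sigma
                invocations += 1
    return heartbeats / (invocations + 1)


delta, sigma = 4, 0.05
print(heartbeat_ratio(delta, sigma, horizon=100_000), 1 / (delta * sigma))
```

With these values the simulated ratio should land close to 1/(δσ) = 5, in line with the Law of Large Numbers argument.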



Next, suppose $r \in F_1$. Then one of the processes will send only finitely many HBMSGs and invoke $SR_{hb}$ finitely often. Thus after the crash, we have

$$\frac{\#\text{-HBMSG}(r, t)}{\#\text{-SR}(r, t) + 1} = \frac{H + \lceil t/\delta \rceil}{I_2 + I_1 + 1},$$

where H is the number of times the crashed process sends HBMSG in r, $I_1$ is the number of times the crashed process invoked $SR_{hb}$ in r, and $I_2$ is the number of times the live process invoked $SR_{hb}$ in r. For all $\eta > 0$, we have that

$$\Pr\left(\lim_{t \to \infty} |I_2 - t\sigma| < \eta t \;\middle|\; F_1\right) = 1.$$

Thus,

$$\Pr\left(\lim_{t \to \infty} \frac{\#\text{-HBMSG}(r, t)}{\#\text{-SR}(r, t) + 1} = \frac{1}{\delta\sigma} \;\middle|\; F_1\right) = 1.$$
Finally, consider the set $F_3$, where both processes crash. Again, the situation here is more complicated, since there are only finitely many complete invocations and HBMSGs in each run, so we cannot resort to the Law of Large Numbers. Let $F_3(j, k)$ be the set of runs where p crashes at time j and q crashes at time k. Clearly $\Pr(F_3(j, k) \mid F_3) = (1 - \beta_p)^j (1 - \beta_q)^k \beta_p \beta_q$ and the number



of heartbeats sent in runs of F3 [j, k) is 
Observe that 



+ 



3+k I I 



+ 



. Let #-HBMSG-^(r) = lim^^oo ^ggff'glf • 



Thus, 



a\l - a) 



j+k-i ( j + k 

\ i 



+ 



k j+k 

a{j + k + l)fr'^ ^ ' 



j + k + l 
i + l 



+ 



j+k+l 

a{3+k + l) f^^ ^ ^ 



j+k+i-i (j + k + l 



+ 



a{j + k + l) 



(1 - (1 - ay+^+^) 



E(#-HBMSG^''S I F3) 



Note that 



^E(#-HBMSG^^e I Fsij,k))FiiF^ij,k)) 

'k 

^ (1 - (1 - - PpYil - 



E 



+ 



a{j + k + l) 



E 



+ 



a(j + k + l) 



E 



+ 



aij + k + l) 



{l-(3py{l-P,)^(3pP, 



{l-ay+''+\l-Ppy{l-(3,)''(3pP,. 



+ 



a{j + k + l) 

for some constant Li (roughly ^). Thus the second summand above is bounded above by 

Li/3p/3,(l - a) 



L,PMl - a) ^((1 - a){l - Pp)yi{l - a){l - f3,)f 



{a + Pp-af5p){a + pg-aPg) 
Lif3pPg{l - a) 
which is $O(\epsilon^2)$. Thus we can ignore the second summand. Taking L(j, k)
get that 



+ 



j+fc+i 
5 ' 



we 



- a{j + k + l) 
3 



It clearly suffices to show that the second summand above is 0(e). Note that jqi^qij < 

i > similarly, jq;:^^:^ < ^/^a, if > Finally, it is clear that -p;^ < 1 for all j,k > 0. 

Call the second summand above S. Since L{j, k) < 2, we have that 



aS < E E(l-/^f)'(l-/^<^)%/59 



J k> 



+ E E 

< 2(^+^) + 2^/;^. 

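Since we assumed that $\beta_p$ and $\beta_q$ are both $O(\epsilon)$ for this theorem, the second summand above is $O(\epsilon)$. Thus, $E(\#\text{-hbmsg}^{\mathrm{avg}} \mid F_3) \approx \frac{1}{\delta\sigma}$. It follows that $E(\#\text{-hbmsg}^{\mathrm{avg}}) \approx \frac{1}{\delta\sigma}$, as desired. ∎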